AN IN VIVO SCREEN USING CHEMICAL INDUCERS OF DIMERIZATION

This application is a continuation-in-part of U.S. Serial No.
09/490,320, filed January 24, 2000, now abandoned, the contents
of which are hereby incorporated by reference.

This invention has been made with government support under
National Science Foundation grants CHE-9626981, CHE-9977402, and
CHE-9984928. Accordingly, the U.S. Government has certain
rights in the invention.

Throughout this application, various publications are referenced
by author or author and date. Full citations for these
publications may be found listed alphabetically at the end of
the specification immediately preceding the claims. The
disclosures of these publications in their entireties are hereby
incorporated by reference into this application in order to more
fully describe the state of the art as known to those skilled
therein as of the date of the invention described and claimed
herein.

Field of Invention 

The disclosed invention relates to the evolution of enzymes in
vivo, and drug screening in vivo, through the use of chemical
inducers of protein dimerization.

Background of the Invention

Initial efforts to design protein catalysts (enzymes) relied on
the modification of individual proteins. (Kaiser 1984; Knowles
1987) Despite some successes, (Wharton 1985; Wilks 1988; Hilvert
1985, 1989, 1994; Imperiali 1994; Johnson 1993) protein
engineering has proven to be difficult and has suggested that
notions of how enzymes work may still be naive. Combinatorial
techniques, which rely on generating and screening large pools
of protein variants simultaneously, offer a promising new
approach to enzyme design. (DeGrado 1997) Several
straightforward methods exist for generating large libraries
(>10^12) of proteins rapidly. (Reidhaar-Olson 1991; Eisenbeis
1985; Wells 1985; Zoller 1983; Leung 1989; Crameri 1998; Zhang
1997, 1999; Stemmer 1994; Stemmer 1994) Methodologies for
identifying protein catalysts from libraries of proteins
primarily have been based not on catalysis, but on binding to
a transition-state analog, (Wagner 1995, 1998; Shokat 1989) as
in the case of catalytic antibodies (Schultz 1989; Schultz 1995;
Hilvert 1985, 1989, 1994; Posner 1994) and phage display. (Baca
1997) While antibodies clearly can catalyze a broad range of
reactions, there are few reports (Jacobsen 1992) that selections
for binding can generate catalysts that rival natural enzymes.
In vivo complementation of essential enzymes, such as chorismate
mutase and triosephosphate isomerase, offers a direct selection
for catalysis but is limited to existing reactions. (Hermes
1990; Kast 1996) General screens and selections for catalysis
are beginning to be reported. (DeGrado 1997; Koltermann 1998;
Pedersen 1998)

Combinatorial techniques allow structure-activity relationships
of enzymes to be amassed quickly. With the aid of powerful
selections it should be possible to create synthetically useful
catalysts for pharmaceuticals and materials. However, as with
proteins, it is difficult to design screens for non-protein
catalysts.

Screens have been developed based on small-molecule inducible
gene expression. Several systems for small-molecule inducible
gene expression have been developed to the point that they are
integral to basic research. The discovery that the lac operon
is induced by binding of lactose to the lac repressor led to the
widespread use of isopropyl-β-D-thiogalactoside (IPTG) to induce
gene expression in bacteria. More recently it has been shown
that by fusing the tet repressor to a eukaryotic transcription
activation domain, gene expression in eukaryotes can be both
negatively and positively regulated using tetracycline. (Gossen
1992, Gossen 1995) The demonstration that transgene expression
can be regulated with tetracycline in transgenic mice highlights
the utility of this system. In addition to the tetracycline-based
system, ecdysone-, (No) estrogen-, (Braselman 1993) and
progesterone-regulated systems (Wang 1994) have been reported.

An extension of these strategies resulted from studies of the
mechanism of action of the immunosuppressants FK506 and
rapamycin. (Rosen 1992) It was found that the biological
activity of both compounds resulted from the fact that they each
dimerize two proteins, FKBP12 and calcineurin or FKBP12 and
FRAP, that otherwise do not interact. One portion of FK506
binds to FKBP12 and another to calcineurin. Based on this
understanding, it was demonstrated that these molecules could
be used to control protein oligomerization inside a cell.

Molecules such as FK506 are small molecule 'dimerizers'
(sometimes referred to as chemical inducers of dimerization,
CIDs) that activate the function of numerous proteins that
regulate many important cellular processes. Dimerizers allow
the functions of proteins to be explored even when small
molecule ligands are unknown. A limited number of such reagents
have been synthesized that control the function of a much larger
number of proteins (expressed as fusions of proteins of interest
linked to a small molecule-responsive dimerization domain).
See, e.g., Austin 1994, Belshaw 1996, Choi 1996, Crabtree 1996,
Diver 1997, Ho 1996, Holsinger 1995, Hung 1996, Klemm 1998,
Liberles 1997, Pruschy 1994, Schreiber 1998, Spencer 1996,
Spencer 1995, Spencer 1993, Stockwell 1998, and Yang 1998.

To generalize this approach, it was shown in 1993 that two FK506
molecules tethered via their C21 allyl groups could oligomerize
proteins fused to FKBP12. Specifically, several FK506 dimers
termed "FK1012s" were shown to oligomerize the cytoplasmic
domain of T-cell receptors when these domains were fused to the
FK506-binding protein FKBP12. Since this initial paper, there
have been several important extensions of this work by Schreiber
and coworkers. Belshaw et al. reported in 1996 that two
different proteins could be dimerized by tethering FK506 to
cyclosporin. In 1997 Diver and Schreiber demonstrated a two-step
synthesis of an FK1012 molecule based on recent olefin
metathesis chemistry.

While this work with FK506 establishes a powerful new approach
for manipulating cellular function with small molecules,
optimized chemical handles that are more convenient to work with
than FK506 are critical for realizing the potential of this
approach. FK506 (Fig. 5B) is cell permeable and has excellent
pharmacokinetic properties, as evidenced by its widespread use
as an immunosuppressant. FK506, however, is not an ideal
chemical handle. FK506 is not available in large quantities,
coupling via the C21 allyl group requires several chemical
transformations including silyl protection of FK506, (Spencer
1993, 1995, 1996; Pruschy 1994) and FK506 is both acid and base
sensitive. (Wagner 1998; Coleman 1989)

One very recent approach to replacing FK506 is to design
synthetic ligands that also bind to FKBP12 with high affinity.
In 1997 Amara et al. reported AP1510, a synthetic dimerizer that
binds FKBP12 with high affinity and that can oligomerize
proteins fused to FKBP12. Very recently a derivative of AP1510,
"5S", was prepared that binds with high affinity to a FKBP12
mutant. (Clackson 1998) This derivative is particularly
interesting because it does not bind with high affinity to wild
type FKBP12.

Recently a system has been reported, named the yeast three-hybrid
system, for detecting ligand-receptor interactions in
vivo. (Licitra, represented in Fig. 2; U.S. Patent No.
5,928,868) This system is based on the principle that small
ligand-receptor interactions underlie many fundamental processes
in biology and form the basis for pharmacological intervention
of human diseases in medicine. This system is adapted from the
yeast two-hybrid system by adding a third synthetic hybrid
ligand. The feasibility of this system was demonstrated using
as the hybrid ligand a dimer of covalently linked dexamethasone
and FK506. The system used yeast expressing fusion proteins
consisting of a) the hormone binding domain of the rat
glucocorticoid receptor fused to the LexA DNA-binding domain and
b) FKBP12 fused to a transcriptional activation domain. When
the yeast was plated on medium containing the dexamethasone-FK506
heterodimer, the reporter genes were activated. The
reporter gene activation is completely abrogated in a
competitive manner by the presence of excess FK506. Using this
system, a screen was performed of a Jurkat cDNA library fused
to the transcriptional activation domain in yeast in the
presence of a dexamethasone-FK506 heterodimer. The yeast in this
system expressed the hormone binding domain of rat
glucocorticoid receptor/DNA binding domain fusion protein.
Overlapping clones of human FKBP12 were isolated. These results
demonstrate that the three-hybrid system can be used to discover
receptors for small ligands and to screen for new ligands to
known receptors.



Other approaches, which do not rely on a readout based on
alterations in genetic expression, have also been developed.
WO 96/30540 (Tsien et al.) discloses a screen for β-lactamase
activity that uses fluorescence resonance energy transfer as the
indicator of β-lactamase activity. The degree of fluorescence
in this screen depends on the level of β-lactamase activity.
Detection of β-lactamase activity relies on detection of changes
in the degree of fluorescence.



However, it has not heretofore been suggested to use small
molecule induced protein dimerization to screen for catalysis
in vivo, and specifically, it has not been suggested to use an
enzyme cleavable moiety to link two molecules to dimerize
proteins.

This invention provides proteins de novo with prescribed binding
and catalytic properties and permits screening cDNA libraries
based on biochemical function. Being able to understand and
manipulate protein-small molecule interactions has broad
implications for basic biomedical research and the
pharmaceutical industry. Proteins engineered to have unique
binding or catalytic properties have already proven useful as
biomedical reagents, medical diagnostics, and even therapeutics.
As with site-directed mutagenesis before it, randomization and
screening techniques also offer an entirely new approach to
understanding the molecular basis for recognition and catalysis.
Technically, a high-throughput approach such as that disclosed
here would speed up the research because the activity of
thousands of protein variants can be measured simultaneously.
Practically, we believe that powerful screens in combination
with existing randomization techniques will make it possible to
take an existing protein fold and "evolve" it into an enzyme
with a new function, generating useful catalysts for the
pharmaceutical and chemical industries. Intellectually, the
ability to modify substrate specificity and catalytic activity
offers a new standard for "understanding" how enzymes function.
A powerful screen is also critical to the end goal of genome
sequencing efforts: determining the function of each and every
protein, bypassing decades of detailed biochemical and genetic
experiments to unravel complex biochemical pathways. Since the
screen is done in vivo and in both prokaryotes and eukaryotes,
the methodology can be applied to functional genomics and drug
discovery. A cDNA library can be screened for all enzymes that
form or cleave a specific type of bond. A library of small
molecules can be screened for its ability to inhibit a specific
enzyme. The screen selects for cell permeability, compatibility
with the cellular milieu, and inhibition of enzyme activity.
The key to all of these applications is a robust screen for
enzymatic activity such as that disclosed here.

Summary of the Invention
The subject invention provides a compound having the formula:

H1-X-B-Y-H2

wherein each of H1 and H2 may be the same or different and
capable of binding to a receptor which is the same or different;
wherein each of X and Y may be present or absent and if present,
each may be the same or different spacer moiety; and wherein B
is an enzyme cleavable moiety. This invention also provides a
method of screening proteins for the ability to catalyze bond
cleavage, comprising the steps of:

a) providing a cell that expresses a pair of fusion
proteins which upon dimerization change a cellular readout;

b) providing the compound of the invention which dimerizes
the pair of fusion proteins, said compound comprising two
portions coupled by a bond that is cleavable by the protein to
be screened; and

c) screening for the cellular readout, wherein a change in the
cellular readout indicates catalysis of bond cleavage by the
protein to be screened.

Finally, the invention also provides
a method of screening proteins for the ability to catalyze bond
formation, comprising the steps of:

a) providing a cell that expresses a pair of fusion
proteins which upon dimerization activate a cellular readout;

b) providing a first compound and a second compound, each
being capable of binding to one of the pair of fusion proteins,
said first and second compound comprising a portion through
which the first and second compounds are coupled to form the
inventive compound by the action of the bond forming protein to
be screened; and

c) screening for the cellular readout, wherein a change in
the cellular readout indicates catalysis of bond formation by
the protein to be screened.

Description of the Figures

Figure 1. The selection strategy. Proteins V and W do not
interact (A) until a BOND links the handles H1 and H2 (B). The
selection can be run in the forward direction to select for BOND
formation or the reverse direction to select for BOND cleavage.

Figure 2. The yeast three-hybrid system. The small molecule
dexamethasone-FK506 (H1-H2) mediates the dimerization of the
LexA-GR (glucocorticoid receptor) and B42-FKBP12 protein
fusions. Dimerization of the DNA-binding protein LexA and the
activation domain B42 activates transcription of the lacZ
reporter gene.

Figure 3. The model reaction. Cephalosporin hydrolysis by the
908R cephalosporinase.

Figure 4. DEX-CEPHEM-FK506 retrosynthesis. Cephem 1 is
commercially available. DEX-CO2H is prepared via oxidation of
the C20 α-hydroxy ketone; FK506-CO2H, via a cross-metathesis
reaction with the C21 allyl group.

Figure 5. The chemical handles dexamethasone (A), FK506 (B),
and methotrexate (C).

Figure 6. The dexamethasone-methotrexate molecules synthesized.
The diamine linkers are commercially available and vary in
length and hydrophobicity.

Figure 7. The Claisen rearrangement (A) and the Diels-Alder
reaction (B) are both pericyclic reactions with six-membered
transition states.

Figure 8. The retro-synthesis of the diene (A) and the
dienophile (B). A Curtius rearrangement is used to introduce
the carbamyl linkage to H1 in the diene. (Overman 1978) A Stille
coupling is used to introduce the alkyl linkage to H2 in the
dienophile. (Duchene 1994) The cyclohexene product will be
prepared through the cycloaddition of these two compounds.

Figure 9. Examples of DEX-DEX molecules synthesized to date.

Figure 10. DEX-MTX retrosynthesis.

Figure 11. Maps of the plasmids encoding the LexA-GR and B42-GR
fusion proteins.

Figure 12. Dex-cephem-Mtx retro-synthesis.

Figure 13. Dex-Mtx protein dimerization system. A cell-permeable
Dex-Mtx molecule is used to induce dimerization of
LexA-GR and DHFR-B42 protein chimeras, activating transcription
of a lacZ reporter gene.

Figure 14. Cell based assays. Yeast cells containing LexA-GR
and B42-DHFR fusion proteins and the lacZ reporter gene are
grown on X-gal plates with or without Dex-Mtx. Dex-Mtx
dimerizes the fusion proteins, activating lacZ transcription,
hydrolyzing the chromogenic substrate X-gal, and turning the
cells blue. Dex-Mtx is added directly to the media in the X-gal
plate. The assay takes two to five days.

Figure 15. X-gal plate assay of Dex-cephem-Mtx induced lacZ
transcription. Yeast strains containing different LexA- and B42
chimeras, plus a lacZ reporter gene, were grown on X-gal
indicator plates with or without Dex-cephem-MTX compounds: A,
1 µM Dex-MTX; B, 10 µM Dex-cephem-MTX; C, no small molecule.
The strains that are dark (blue in original) even in the absence
of small molecule (plate C) are positive controls for
protein-protein interaction. The dark strains on plates A and B express
LexA-DHFR and B42-GR fusion proteins, and the white strains are
negative controls, expressing only LexA and B42.

Figure 16A. Plate BTC4 grown on 4 different plates after 72
hours. One plate has no small molecule, so just the positive
controls should be dark. The other three plates each contain
10 µM DM1, 10 µM D8M, or 10 µM D10M. Figure 16B is the plate
map for plate BTC4.

Figure 17A. Plate BTC6 grown on 4 plates after 56 hours. The two
top plates contain no small molecule, and the bottom two plates
contain 10 µM D10M. Figure 17B shows plate BTC6 grown on 2
plates after 60 hours. Both plates contain 1 µM D8M. Figure 17C
shows the plate map for plate BTC6.

Figure 18. The β-galactosidase activity of strain V494Y using
varying concentrations of D8M.

Figure 19. A screen for glycosidase activity. Dex-Mtx CIDs with
cleavable oligosaccharide linkers are used to assay the >3000
proteins in S. cerevisiae of unknown function for glycosidase
activity. A yeast cDNA library is introduced into the selection
strain. Only cells expressing active glycosidases cleave the
oligosaccharide linker, disrupt ura3 transcription, and survive
in the presence of 5-FOA.

Figure 20. Proposed solid-phase synthesis of the Dex-Mtx
glycosidase substrates. While the synthesis of Dex-(GlcNAc)4-Mtx
is shown, the synthesis is designed to allow the
introduction of a variety of sugar monomers with both regio- and
stereo-control.

Detailed Description of the Invention

The subject invention provides a compound having the formula:

H1-X-B-Y-H2

wherein each of H1 and H2 may be the same or different and
capable of binding to a receptor which is the same or different;
wherein each of X and Y may be present or absent and if present,
each may be the same or different spacer moiety; and wherein B
is an enzyme cleavable moiety.

In one embodiment, each of H1 and H2 is capable of binding to
a receptor with an IC50 of less than 100 nM. In a preferred
embodiment, each of H1 and H2 is capable of binding to a
receptor with an IC50 of less than 10 nM. In the most preferred
embodiment, each of H1 and H2 is capable of binding to a
receptor with an IC50 of less than 1 nM.

In one embodiment, B is capable of binding to an enzyme with an
IC50 of less than 100 mM. In a further embodiment, B is capable
of binding to an enzyme with an IC50 of less than 10 mM. In yet
a further embodiment, B is capable of binding to an enzyme with
an IC50 of less than 1 mM. In a preferred embodiment, B is
capable of binding to an enzyme with an IC50 of less than 100 µM,
more preferably, B is capable of binding to an enzyme with an
IC50 of less than 10 µM, and most preferably, B is capable of
binding to an enzyme with an IC50 of less than 1 µM.

In another embodiment, either H1 and H2 are different, or X
and Y are different.

Furthermore, B may be cleavable by an enzyme selected from the
group of enzymes consisting of transferases, hydrolases, lyases,
isomerases, and ligases.

The transferase is selected from the group consisting of a
one-carbon transferase, an aldehyde or ketone transferase, an acyl
transferase, a glycosyl transferase, an alkyl or aryl
transferase, an N-containing group transferase, a P-containing
group transferase, an S-containing group transferase, an
O-containing group transferase, and a Se-containing group
transferase.

The hydrolase is selected from the group consisting of an ester
hydrolase, a glycosidic hydrolase, an ether hydrolase, a peptide
hydrolase, a C-N (non-peptide) hydrolase, an acid anhydride
hydrolase, a C-C hydrolase, a P-N hydrolase, an S-N hydrolase,
a C-P hydrolase, and an S-S hydrolase.

The lyase is selected from the group consisting of a C-C lyase,
a C-O lyase, a C-N lyase, a C-S lyase, and a P-O lyase.

The isomerase is selected from the group consisting of
racemases, epimerases, cis-trans isomerases,
intra-oxidoreductases, intra-transferases (mutases), and
intramolecular lyases.

The ligase is selected from the group consisting of a C-O
ligase, a C-S ligase, a C-N ligase, a C-C ligase, and a P-O
ligase.

In a preferred embodiment, B is an enzyme cleavable moiety
selected from the group consisting of phosphodiester, glycoside,
amide, ester, diester, aldol product, and acetate moiety. In
a most preferred embodiment B represents an amide moiety, or a
cephem moiety.

Each of H1 or H2 may be derived from a compound selected from
the group consisting of steroids, hormones, nuclear receptor
ligands, cofactors, antibiotics, sugars, enzyme inhibitors, and
drugs.

Each of H1 and H2 may also represent a compound selected from
the group consisting of dexamethasone, 3,5,3'-triiodothyronine,
trans-retinoic acid, biotin, coumermycin, tetracycline, lactose,
methotrexate, FK506, and FK506 analogs.

In a preferred embodiment, each of H1 and H2 is derived from the
compound of Figure 5A, or the compound of Figure 5B, or the
compound of Figure 5C.

The compound H1-X-B-Y-H2 may be formed in a cell by the
reaction of a first compound having the formula:

H1-X-B'

with a second compound having the formula:

H2-Y-B''

wherein B' and B'' are moieties that react to form B in the
presence of an enzyme.

The enzyme in this reaction may be selected from the group of
enzymes consisting of transferases, lyases, isomerases, and
ligases. Each one of the transferases, lyases, isomerases, and
ligases comprises groups as noted above.
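
By way of illustration only, the in-cell formation of
H1-X-B-Y-H2 from the two half-compounds can be sketched as a
small data model. The class and function names below
(HalfCompound, Dimerizer, couple) and the example handle labels
are assumptions introduced for this sketch and are not part of
the specification; only the set of bond-forming enzyme classes
is taken from the text above.

```python
from dataclasses import dataclass
from typing import Optional

# Enzyme classes the text names for the bond-forming reaction.
BOND_FORMING_CLASSES = {"transferase", "lyase", "isomerase", "ligase"}


@dataclass(frozen=True)
class HalfCompound:
    """Hypothetical model of H1-X-B' or H2-Y-B''."""
    handle: str            # H1 or H2, e.g. "Dex" or "Mtx"
    spacer: Optional[str]  # X or Y; None when the spacer is absent
    reactive_moiety: str   # B' or B''


@dataclass(frozen=True)
class Dimerizer:
    """Hypothetical model of the coupled compound H1-X-B-Y-H2."""
    h1: str
    x: Optional[str]
    b: str
    y: Optional[str]
    h2: str

    def formula(self) -> str:
        # Absent spacers are simply left out of the written formula.
        parts = [self.h1, self.x, self.b, self.y, self.h2]
        return "-".join(p for p in parts if p is not None)


def couple(first: HalfCompound, second: HalfCompound,
           enzyme_class: str, product_moiety: str) -> Optional[Dimerizer]:
    """Return the coupled compound, or None if the enzyme class is not
    one of those listed for bond formation."""
    if enzyme_class not in BOND_FORMING_CLASSES:
        return None
    return Dimerizer(first.handle, first.spacer, product_moiety,
                     second.spacer, second.handle)
```

In this sketch a hydrolase returns None because hydrolases
appear in the list of cleaving enzymes but not in the list of
bond-forming enzymes.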

This invention also provides a compound having the formula:

H1-X-B'

wherein H1 is capable of binding to a receptor;
wherein X is a spacer moiety which may be present or absent; and
wherein B' is a moiety capable of binding to an enzyme.

H1 should be capable of binding to a receptor with an IC50 of less
than 100 nM, preferably H1 is capable of binding to a receptor
with an IC50 of less than 10 nM, more preferably H1 is capable of
binding to a receptor with an IC50 of less than 1 nM.

B' should be capable of binding to an enzyme with an IC50 of less
than 100 mM, preferably B' is capable of binding to an enzyme
with an IC50 of less than 50 mM, more preferably B' is capable
of binding to an enzyme with an IC50 of less than 1 mM, yet more
preferably B' is capable of binding to an enzyme with an IC50 of
less than 100 µM, yet more preferably B' is capable of binding
to an enzyme with an IC50 of less than 10 µM, and most preferably
B' is capable of binding to an enzyme with an IC50 of less than
1 µM.

The compound H1-X-B' may react with a moiety which has the
formula:

H2-Y-B''

wherein H2 is capable of binding to a receptor; wherein Y is a
spacer moiety which may be present or absent; wherein B'' is a
moiety that reacts with B' in the presence of the enzyme.

This invention also provides a complex comprising the compound
having the formula H1-X-B-Y-H2 complexed to an enzyme.

In the complex, the compound is capable of binding to the enzyme
with an IC50 of less than 100 mM, preferably with an IC50 of less
than 10 mM, more preferably with an IC50 of less than 1 mM, yet
more preferably with an IC50 of less than 100 µM, yet more
preferably with an IC50 of less than 10 µM, and most preferably
with an IC50 of less than 1 µM.

This invention also provides a complex comprising the compound
having the formula H1-X-B' complexed to an enzyme.

In this complex, the compound is capable of binding to the
enzyme with an IC50 of less than 100 mM, preferably with an IC50
of less than 10 mM, more preferably with an IC50 of less than 1
mM, yet more preferably with an IC50 of less than 100 µM, yet
more preferably with an IC50 of less than 10 µM, and most
preferably with an IC50 of less than 1 µM.



This invention also provides a composition comprising the
compound having the formula H1-X-B-Y-H2, or the compound
having the formula H1-X-B'. The composition may further
comprise an enzyme.

This invention also provides a composition comprising the
complex of the compound having the formula H1-X-B-Y-H2
with an enzyme, or of the compound having the formula H1-X-B'
with an enzyme.

This invention also provides a method of screening proteins for
the ability to catalyze bond cleavage, comprising the steps of:

a) providing a cell that expresses a pair of fusion
proteins which upon dimerization change a cellular readout;

b) providing a compound which dimerizes the pair of fusion
proteins, said compound comprising two portions coupled by a
bond that is cleavable by the protein to be screened; and

c) screening for the cellular readout, wherein a change in the
cellular readout indicates catalysis of bond cleavage by the
protein to be screened.

The cellular readout may be reconstitution of enzymatic
activity.

The method further provides a cell that contains a gene which
is activated by a dimerized pair of fusion proteins. The pair
of fusion proteins are dimerized by a compound having the
formula H1-X-B-Y-H2.

The cellular readout may also be gene transcription, such that
a decrease of gene transcription indicates catalysis of bond
cleavage by the protein to be screened.

The gene transcribed may be lacZ, leu2, ura3, his3, or trp.

This invention also provides a method of screening proteins for
the ability to catalyze bond formation, comprising the steps of:

a) providing a cell that expresses a pair of fusion
proteins which upon dimerization activate a cellular readout;

b) providing a first compound and a second compound, each
being capable of binding to one of the pair of fusion proteins,
said first and second compound comprising a portion through
which the first and second compounds are coupled by the action
of the bond forming protein to be screened; and

c) screening for the cellular readout, wherein a change in
the cellular readout indicates catalysis of bond formation by
the protein to be screened.

The cellular readout may be enzyme activity.

The method further comprises providing a cell that contains a
gene which is activated by the dimerized pair of fusion
proteins.

The cellular readout may be gene transcription, such that an
increase in gene transcription indicates catalysis of bond
formation by the protein to be screened.

In this method, either the first or the second compound is the
compound having the formula H1-X-B'.
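
By way of illustration only, the logic of the two screening
methods above, in which transcription decreases on bond cleavage
and increases on bond formation, can be sketched as follows. The
function names and the two-fold cutoff are assumptions made for
this sketch; the specification does not state a numerical
threshold.

```python
def reporter_on(initially_linked: bool, enzyme_active: bool) -> bool:
    """Idealized readout: the reporter is on only while an intact
    dimerizer bridges the two fusion proteins.

    Cleavage screen: the cell is given the linked compound, and an
    active enzyme cleaves it. Formation screen: the cell is given the
    two halves, and an active enzyme couples them.
    """
    if initially_linked:
        return not enzyme_active
    return enzyme_active


def call_hit(screen: str, control_signal: float, clone_signal: float,
             fold: float = 2.0) -> bool:
    """Flag a library clone as a candidate catalyst.

    `fold` is a hypothetical cutoff relative to a control without the
    protein being screened; it is not a value from the specification.
    """
    if screen == "cleavage":
        # Cleavage removes the dimerizer, so transcription decreases.
        return clone_signal * fold < control_signal
    if screen == "formation":
        # Formation creates the dimerizer, so transcription increases.
        return clone_signal > control_signal * fold
    raise ValueError("screen must be 'cleavage' or 'formation'")
```

The same pair of fusion proteins and the same reporter serve
both directions; only the compound supplied and the sign of the
expected change differ.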

In either of the methods of this invention, the cell is selected
from the group consisting of yeast, bacterial, and mammalian
cells. The cell may be selected from the group consisting of
S. cerevisiae and E. coli.

The pair of fusion proteins is the rat glucocorticoid receptor
(rGR2), or binding domain thereof, fused to LexA, and FKBP12
fused to the B42 transcriptional activation domain.

The pair of fusion proteins may also be the dihydrofolate
reductase (DHFR), or binding domain thereof, fused to LexA, and
FKBP12 fused to the B42 transcriptional activation domain.

The pair of fusion proteins may further be dihydrofolate
reductase (DHFR), or binding domain thereof, fused to LexA, and
the rat glucocorticoid receptor (rGR2), or binding domain
thereof, fused to the B42 transcriptional activation domain.

The pair of fusion proteins may yet further be the rat
glucocorticoid receptor (rGR2), or binding domain thereof, fused
to LexA, and dihydrofolate reductase (DHFR), or binding domain
thereof, fused to the B42 transcriptional activation domain.

Finally, the pair of fusion proteins may yet even further be
dihydrofolate reductase (DHFR), or binding domain thereof, fused
to LexA, and the rat glucocorticoid receptor (rGR2), or binding
domain thereof, fused through a 6-Glycine linker to the B42
transcriptional activation domain.
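
By way of illustration only, the matching of chemical handles to
the fusion-protein pairs listed above can be sketched as a
lookup. The handle-receptor pairings (dexamethasone with the
glucocorticoid receptor, FK506 with FKBP12, methotrexate with
DHFR) follow the text; the names RECEPTOR_FOR_HANDLE,
FUSION_PAIRS, can_dimerize, and compatible_pairs are assumptions
introduced for this sketch, and the 6-Glycine linker variant is
omitted because it uses the same binding domains as the third
pair.

```python
from typing import List, Tuple

# Receptor bound by each chemical handle, as described in the text.
RECEPTOR_FOR_HANDLE = {"Dex": "GR", "FK506": "FKBP12", "Mtx": "DHFR"}

# (partner fused to LexA, partner fused to B42) for each listed pair.
FUSION_PAIRS: List[Tuple[str, str]] = [
    ("GR", "FKBP12"),
    ("DHFR", "FKBP12"),
    ("DHFR", "GR"),
    ("GR", "DHFR"),
]


def can_dimerize(h1: str, h2: str, pair: Tuple[str, str]) -> bool:
    """True when one handle binds the LexA fusion and the other binds
    the B42 fusion, in either orientation."""
    r1 = RECEPTOR_FOR_HANDLE.get(h1)
    r2 = RECEPTOR_FOR_HANDLE.get(h2)
    if r1 is None or r2 is None:
        return False
    return (r1, r2) == pair or (r2, r1) == pair


def compatible_pairs(h1: str, h2: str) -> List[Tuple[str, str]]:
    """List the fusion pairs that a compound with handles h1 and h2
    could bridge."""
    return [p for p in FUSION_PAIRS if can_dimerize(h1, h2, p)]
```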

In either method the protein to be screened is an enzyme
selected from the group of enzyme classes consisting of
transferases, hydrolases, lyases, isomerases and ligases.

In either method, the screening is performed by Fluorescence
Activated Cell Sorting (FACS), or gene transcription markers
selected from the group consisting of Green Fluorescent
Protein, lacZ-β-galactosidase, luciferase, antibiotic
resistance β-lactamases, and yeast markers.



This invention also provides a method of screening a compound
for the ability to inhibit an enzyme comprising:

screening for activity of the enzyme by the method
disclosed herein, and obtaining cells which express an active
enzyme, and

contacting the cells with the drug to be screened, wherein
a change in the transcription of the reporter gene within the
cell after contact with the drug indicates inhibition of the
enzyme by the drug.

This invention also provides a drug for the inhibition of an
enzyme selected by this method.
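
By way of illustration only, the inhibitor screen above can be
sketched as a comparison of reporter signal before and after
contact with each candidate drug. The function names, the
example compound labels, and the two-fold cutoff are assumptions
made for this sketch; the text requires only "a change" in
reporter transcription.

```python
from typing import Dict, List


def inhibition_indicated(before: float, after: float,
                         min_fold_change: float = 2.0) -> bool:
    """True when reporter signal changes by at least `min_fold_change`
    in either direction after contact with the drug.

    The cutoff is hypothetical. Both signals must be positive so that
    the ratio is defined.
    """
    if before <= 0 or after <= 0:
        raise ValueError("reporter signals must be positive")
    ratio = after / before
    return ratio >= min_fold_change or ratio <= 1.0 / min_fold_change


def screen_compounds(baseline: float, readings: Dict[str, float],
                     min_fold_change: float = 2.0) -> List[str]:
    """Return the names of compounds whose reading differs from the
    baseline of cells expressing the active enzyme."""
    return [name for name, signal in readings.items()
            if inhibition_indicated(baseline, signal, min_fold_change)]
```

Because the readout is taken in living cells, a compound flagged
in this way has also passed the cell permeability and
compatibility requirements noted in the Background.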

This invention further provides a method of evolving a protein
with a new catalytic activity comprising screening proteins
derived from a library of proteins which are mutants of a known
protein, using either of the screening methods provided by this
invention.

Thus, this invention also provides a protein with new catalytic 
activity evolved by this method. 

This invention also provides a method of evolving an enzyme with 
a new substrate specificity comprising screening enzymes derived 
from a library of enzymes which are mutants of an enzyme with 
known substrate specificity, using either of the screening 
methods provided by this invention. 

Thus, this invention provides an engineered enzyme having new 
substrate specificity evolved by this method. 

This invention also provides a method for evolving an enzyme
that functions with a cofactor which is different from the
cofactor the natural enzyme uses, comprising:

evolving mutants of the natural enzyme; and

screening the mutants of the natural enzyme in the
presence of a cofactor different from the cofactor of the
natural enzyme, using either of the screening methods provided
by this invention.

Thus, this invention provides an engineered enzyme that
functions with a cofactor which is different from the cofactor
the enzyme naturally uses, evolved by this method.

The foregoing embodiments of the subject invention may be
accomplished according to the guidance which follows. Certain
of the foregoing embodiments are exemplified. Sufficient
guidance is provided for a skilled artisan to arrive at all of
the embodiments of the subject invention.

Also disclosed is a compound having the formula: 

H1-Y-H2 

wherein H1 is methotrexate or an analog thereof;

wherein H2 is capable of binding to a receptor, and

wherein Y is a moiety providing a covalent linkage between H1
and H2, which may be present or absent, and when absent, H1 is
covalently linked to H2. H2 may be Dex or an analog thereof.

When H1 is Mtx, then H2 may be Dex or an analog thereof.

The compound may also have the formula Mtx-Y-H2, and the formula 
Dex-Y-Mtx. The compound may also have the formula:

In the compound, H2 may be capable of binding to a receptor
with an IC50 of less than 100 mM; or an IC50 of less than
10 mM; or an IC50 of less than 1 mM; or an IC50 of less than
100 μM; or an IC50 of less than 10 μM; or an IC50 of less than
1 μM; or an IC50 of less than 100 nM; or an IC50 of less than
10 nM; or an IC50 of less than 1 nM.



The compound may have the formula:

Also disclosed is a complex between the compound having the
formula H1-Y-H2 and a fusion protein which comprises a binding
domain capable of binding to methotrexate, wherein H1 of the
compound binds to the binding domain of the fusion protein. The
binding domain may be that of the DHFR receptor.

In the complex, H1 is capable of binding to the binding domain
of the fusion protein with an IC50 of less than 100 nM; or an
IC50 of less than 10 nM; or an IC50 of less than 1 nM; or an
IC50 of less than 100 pM; or an IC50 of less than 10 pM; or an
IC50 of less than 1 pM.

In the complex, the fusion protein may be DHFR-LexA or
DHFR-B42.

Also disclosed is a cell comprising the complex. 

Also disclosed is a method of dimerizing two fusion proteins
inside a cell using the compound having the formula H1-Y-H2,
comprising the steps of a) providing a cell that expresses a
first fusion protein which comprises a binding domain that binds
to H1 and a second fusion protein which comprises a binding
domain that binds to H2, and b) contacting the compound having
the formula H1-Y-H2 with the cell so as to dimerize the two
fusion proteins.

In the method, the first fusion protein or the second fusion
protein may be DHFR-(DNA-binding domain); or the first fusion
protein or the second fusion protein may be DHFR-(transcription
activation domain).

Also in the method, the first fusion protein or the second
fusion protein may be DHFR-LexA; or the first fusion protein or
the second fusion protein may be DHFR-B42.

Also disclosed is a method for identifying a molecule that binds
a known target in a cell from a pool of candidate molecules,
comprising:

(a) covalently bonding each molecule in the pool of
candidate molecules to a methotrexate moiety or an analog of
methotrexate to form a screening molecule;

(b) introducing the screening molecule into a cell which
expresses a first fusion protein comprising a binding domain
capable of binding methotrexate, a second fusion protein
comprising the known target, and a reporter gene wherein
expression of the reporter gene is conditioned on the proximity
of the first fusion protein to the second fusion protein;

(c) permitting the screening molecule to bind to the first
fusion protein and to the second fusion protein so as to
activate the expression of the reporter gene;

(d) selecting which cell expresses the reporter gene; and

(e) identifying the small molecule that binds the known
target.

In the method, the cell may be selected from the group
consisting of insect cells, yeast cells, mammalian cells, and
their lysates. The first or the second fusion protein may
comprise a transcription module selected from the group
consisting of a DNA binding protein and a transcriptional
activator. Also, the molecule may be obtained from a
combinatorial library.



Steps (b)-(e) of the method may be repeated iteratively in the 
presence of a preparation of random small molecules for 
competitive binding with the hybrid ligand so as to identify a 
molecule capable of competitively binding the known target. 



Also disclosed is a method for identifying a protein target to
which a molecule is capable of binding, comprising:

(a) providing a screening molecule comprising a
methotrexate moiety or an analog of methotrexate covalently
bonded to a ligand which has a specificity for an unknown
protein target;

(b) introducing the screening molecule into a cell which
expresses a first fusion protein comprising a binding domain
capable of binding methotrexate, a second fusion protein
comprising the unknown protein target, and a reporter gene
wherein expression of the reporter gene is conditioned on the
proximity of the first fusion protein to the second fusion
protein;

(c) permitting the screening molecule to bind to the first
fusion protein and to the second fusion protein so as to
activate the expression of the reporter gene;

(d) selecting which cell expresses the reporter gene; and

(e) identifying the unknown protein target.

In the method, the unknown protein target may be encoded by a
DNA from the group consisting of genomic DNA, cDNA and
synthetic DNA. The ligand may have a known biological function.

Evolution of a Novel Enzyme

We begin with an evolution of a novel enzyme. While it is 
desirable to evolve novel enzymes using combinatorial 
techniques, the major barrier to applying combinatorial 
techniques to catalyst design is the lack of selections that are 
both general and depend directly on catalysis. The following 
disclosure provides a selection based on in vivo assays for 
protein-protein interactions, generally represented in Fig. 1. 

Selection Strategy 

The selection strategy is based on existing methods for
controlling protein dimerization in vivo using small molecules
(Fig. 1). Several "chemical inducers of dimerization" have been
reported showing that protein dimerization can be bridged by
small molecules. (Spencer 1996, 1995, 1993; Crabtree 1996)
Moreover, a number of techniques exist for translating the
dimerization of two proteins to an in vivo screen or selection.
(Hu 1990; Hu 1995; Fields 1989; Gyuris 1993; Johnsson 1994;
Rossi 1997; Karimova 1998) Taken together, this work establishes
that it is feasible to use a small molecule H1-H2 to dimerize
two fusion proteins, reporter V-H1 receptor and reporter W-H2
receptor, generating a cellular read-out.

It has also been established that enzyme activity can be
monitored in vivo, and new enzymes can be screened for in
vivo, by splitting peptides. (Baum 1990; Smith 1991; Kamada
1998; Hawkins 1999)

Disclosed is the dimerization of two proteins via covalent
coupling of H1 and H2 as the basis for a general selection for
catalysis. That is, the small molecule H1-X-BOND-Y-H2
represented in Fig. 1 is used to mediate protein dimerization
and hence a cellular signal. Then the enzyme that catalyzes
either BOND formation or BOND cleavage is selected. The
catalyst is tied to the cellular "read-out" because only cells
containing an active enzyme have the desired phenotype.

The strategy is both general and a direct selection for
catalysis. The selection can be applied to a broad range of
reactions because protein dimerization depends only on H1 and
H2, not X, Y, or the BOND being formed or cleaved. It is a
direct selection for catalysis because covalent coupling of H1
and H2 is necessary for protein dimerization. Also, unlike
catalytic antibodies, this strategy does not limit the starting
protein scaffold.
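
The logic of the selection can be sketched as a small truth
table. The following is only an illustrative model, assuming an
idealized cell in which the reporter is transcribed exactly when
H1 and H2 are covalently coupled; the function name and the
"cleavage"/"formation" labels are hypothetical and are not part
of the disclosed method.

```python
def reporter_is_on(active_enzyme: bool, reaction: str) -> bool:
    """Idealized read-out: reporter is on iff H1 and H2 are covalently linked.

    reaction = "cleavage":  the cell is fed intact H1-X-BOND-Y-H2, and an
                            active enzyme cleaves BOND.
    reaction = "formation": the cell is fed separate H1-X and Y-H2, and an
                            active enzyme forms BOND.
    """
    if reaction == "cleavage":
        linked = not active_enzyme   # active enzyme destroys the bridge
    elif reaction == "formation":
        linked = active_enzyme       # active enzyme creates the bridge
    else:
        raise ValueError("reaction must be 'cleavage' or 'formation'")
    return linked


# Enumerate the four idealized cases.
for reaction in ("cleavage", "formation"):
    for active in (False, True):
        state = "on" if reporter_is_on(active, reaction) else "off"
        print(f"{reaction:9s} active_enzyme={active!s:5s} reporter {state}")
```

In this simplified picture, the phenotype depends only on
whether BOND is intact, which is why the identity of X, Y, and
the BOND itself does not enter the read-out.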

Preparation and design of handles "H1" and "H2"

Ideally, a chemical handle should bind its receptor with high
affinity (≤ 100 nM), cross cell membranes yet be inert to
modification or degradation, be available in reasonable
quantities, and present a convenient side-chain for routine
chemical derivatization that does not disrupt receptor binding.
Again, we build from DEX-FK506 (H1-H2) mediated dimerization of
LexA-rGR and B42-FKBP12 (Fig. 2) (Licitra; U.S. Patent No.
5,928,868).

Dexamethasone (DEX) is a very attractive chemical handle H1
(Fig. 5A). DEX binds rat glucocorticoid receptor (GR) with a
K_D of 5 nM, (Chakraborti 1991) can regulate the in vivo
activity and nuclear localization of GR fusion proteins (Picard
1987), and is commercially available. Affinity columns for rGR
have been prepared via the C20 α-hydroxy ketone of
dexamethasone. (Govindan 1980; Manz 1983)

The antibacterial and anticancer drug methotrexate (MTX) is used
in place of FK506 as the chemical handle H2 (Fig. 5B, 5C).
FK506 is not available in large quantities, coupling via the C21
allyl group requires several chemical transformations including
silyl protection of FK506, (Spencer 1993, 1995, 1996; Pruschy
1994) and FK506 is both acid- and base-sensitive. (Wagner 1995,
1998; Coleman 1989) MTX, on the other hand, is commercially
available and can be modified selectively at its γ-carboxylate
without disrupting dihydrofolate reductase (DHFR) binding.
(Kralovec 1989; Bolin 1982) Even though MTX inhibits DHFR with
pM affinity, (Bolin 1982; Sasso 1994) both E. coli and S.
cerevisiae grow in the presence of MTX when supplemented with
appropriate nutrients. (Huang 1992)

The ability of DEX-MTX to mediate the dimerization of LexA-rGR
and B42-DHFR is tested by (1) synthesis of a series of DEX-MTX
molecules with simple diamine linkers (Fig. 6); and (2) showing
that DEX-MTX can dimerize LexA-rGR and B42-DHFR based on lacZ
transcription and that both DEX and MTX, uncoupled, can
competitively disrupt this dimerization. Cell-permeable
chemical handles that can be prepared readily and that are
efficient at inducing protein dimerization not only are
essential to the robustness of this selection methodology but
also should find broad use as chemical inducers of protein
dimerization.

Dexamethasone (DEX) and the glucocorticoid receptor (GR) present
a particularly attractive chemical handle/receptor pair.
Dexamethasone is the cortical steroid with the highest affinity
for the rat glucocorticoid receptor. The rGR binds DEX with a
K_D of 5 nM, and mutants of rGR have been isolated with up to
10-fold higher affinity for DEX. (Chakraborti 1991) The steroid
dexamethasone has been used extensively as a cell-permeable
small molecule to regulate the in vivo activity and nuclear
localization of GR fusion proteins. (Picard 1987) This work
firmly establishes that DEX is cell permeable and is not
modified or broken down in the cell. Recently, there has been
one report of a yeast "three-hybrid" system in which a GR-DNA-
binding protein fusion and a FKBP12-transcription activation
domain fusion could be dimerized by the small molecule DEX-FK506
(Fig. 2). Dexamethasone is commercially available in large
quantities. Affinity columns for rGR have been prepared via
oxidation of the C20 α-hydroxy ketone of DEX to the
corresponding carboxylic acid. (Govindan 1980; Manz 1983)
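
To give a feel for what a K_D of 5 nM implies, the following
sketch computes receptor occupancy under a simple one-site
binding model. It assumes that the free ligand concentration is
approximately equal to the total ligand concentration and that
binding is at equilibrium; the function name and the ligand
concentrations are illustrative assumptions, not measured
intracellular values.

```python
def fraction_bound(ligand_nM: float, kd_nM: float) -> float:
    """One-site equilibrium occupancy: [L] / ([L] + K_D).

    Assumes free ligand ~ total ligand (ligand in excess over receptor).
    """
    if ligand_nM < 0 or kd_nM <= 0:
        raise ValueError("ligand must be >= 0 and K_D must be > 0")
    return ligand_nM / (ligand_nM + kd_nM)


KD_DEX_RGR_NM = 5.0  # K_D of DEX for rGR as stated in the text

# Hypothetical free DEX concentrations, in nM.
for conc in (0.5, 5.0, 50.0, 500.0):
    occ = fraction_bound(conc, KD_DEX_RGR_NM)
    print(f"[DEX] = {conc:6.1f} nM  ->  occupancy = {occ:.2f}")
```

Under these assumptions the receptor is half occupied when the
free ligand concentration equals K_D, which is one reason a
handle with nanomolar or tighter affinity is attractive.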

Methotrexate (MTX) inhibition of dihydrofolate reductase (DHFR)
is one of the textbook examples of high-affinity ligand binding.
The interaction between MTX and DHFR is extremely well
characterized in the literature both biochemically and
structurally. DHFR is a monomeric protein and binds MTX with
picomolar affinity. (Bolin 1982; Sasso 1994) Even though MTX
inhibits DHFR with such high affinity, both E. coli and S.
cerevisiae grow in the presence of MTX when supplemented with
appropriate nutrients. (Huang 1992) The ability of MTX to serve
both as an antibacterial and an anticancer agent is clear
evidence that MTX has excellent pharmacokinetic properties. MTX
is known to be imported into cells via a specific folate
transporter protein. MTX is commercially available and can be
synthesized readily from simple precursors. MTX can be modified
selectively at its γ-carboxylate without disrupting its
interaction with DHFR. (Kralovec 1989; Bolin 1982) There are
several examples reported where MTX has been modified via its
γ-carboxylate to prepare affinity columns and antibody
conjugates.

Given the number of cellular pathways that depend on cascades
of dynamic protein-protein interactions, the ability to regulate
protein oligomerization in vivo with small molecules should have
broad applications in medicine and basic science. The key to
realizing the potential of these small molecules both for the
catalysis screen in the laboratory and for these biomedical
applications is developing H1-H2 molecules that can be prepared
readily and are efficient at inducing protein dimerization in
vivo.

Other handles H1 and H2 may be, for example: steroids, such as
the dexamethasone used herein; enzyme inhibitors, such as the
methotrexate used herein; drugs, such as FK506; hormones, such
as the thyroid hormone 3,5,3'-triiodothyronine; ligands for
nuclear receptors, such as retinoic acids; general cofactors,
such as biotin; and antibiotics, such as coumermycin (which can
be used to induce protein dimerization according to Perlmutter
et al., Nature 383, 178 (1996)).



Derivatives of the mentioned compounds with groups suitable for
linking without interfering with receptor binding can also be
used.



It has been found that the combination of a CID containing an
Mtx moiety with a fusion protein containing a DHFR binding
domain is highly useful and widely applicable. Mtx and the
DHFR receptor present a particularly attractive chemical 
handle/receptor pair. In addition to having a picomolar binding 
affinity, the complex of an Mtx moiety and the DHFR binding 
domain is extremely well characterized. The excellent 
pharmacokinetic properties of Mtx make it an ideal moiety to be 
used in procedures where ease of importation into cells is 
required. 

Linking H1 and H2 through a linker

To illustrate how the handles H1 and H2 may be linked together,
several of the DEX-DEX compounds that have been synthesized to
date are shown in Figure 9. The linkers are all commercially
available or can be prepared in a single step. The linkers vary
in hydrophobicity, length, and flexibility. The DEX-DEX
molecules shown in Figure 9 were prepared from dexamethasone and
the corresponding diamines. The C20 α-hydroxy ketone of
dexamethasone was oxidized using sodium periodate to the
corresponding carboxylic acid in quantitative yield as
described. The diamines are commercially available. The
diamine corresponding to DEX-DEX 2 was prepared from
α,α'-dibromo-m-xylene and aminoethanethiol and used crude. The
diamines were coupled to the carboxylic acid derivative of
dexamethasone using the peptide-coupling reagent PyBOP under
standard conditions in 60-80% yield.

We have synthesized a DEX-MTX molecule. The retrosynthesis is
shown in Figure 10. The synthesis is designed to be modular so
that we can easily bring in a variety of linkers in one of the
final steps as the dibromo- or diiodo-derivatives. For
synthetic ease, the glutamate residue has been replaced with
homocysteine. This replacement should be neutral because there
is both biochemical and structural evidence that the
γ-carboxylate of methotrexate can be modified without disrupting
DHFR binding. The final compound has been synthesized in 12
steps in 1.3% overall yield. Also synthesized are analogous
compounds where the α,α'-dibromo-m-xylene linker is replaced
with 1,5-diiodopentane or 1,10-diiododecane. A similar route
is used to prepare MTX-MTX molecules.

Design of the protein chimeras

The second important feature is the design of the protein
chimeras. The yeast two-hybrid assay was chosen in the examples
because of its flexibility. Specifically, the Brent two-hybrid
system is used, which uses LexA as the DNA-binding domain and
B42 as the transcription activation domain. The Brent system
is one of the two most commonly used yeast two-hybrid systems.
An advantage of the Brent system is that it does not rely on
Gal4, allowing use of the regulatable Gal promoter. lacZ under
control of four tandem LexA operators is used as the reporter
gene. Initially, we chose to make simple LexA-GR, LexA-DHFR,
B42-GR, and B42-DHFR fusion proteins that do not depart from the
design of the Brent system. In the Brent system, the full-length
LexA protein, which includes both the N-terminal DNA-binding
domain and the C-terminal dimerization domain, is used.
The B42 domain is a monomer. The C-terminal hormone-binding
domain of the rat glucocorticoid receptor is chosen because this
domain was shown to work previously in the yeast three-hybrid
system reported by Licitra, et al. Both the E. coli and the
murine DHFRs are used because these are two of the most well
characterized DHFRs. The E. coli protein has the advantage that
methotrexate binding is independent of NADPH binding.



Construction of the LexA- and B42-receptor fusions is
facilitated by the availability of commercial vectors for the
Brent two-hybrid system. These vectors are shuttle vectors that
can be manipulated both in bacteria and yeast. The LexA chimera
is under control of the strong, constitutive alcohol
dehydrogenase promoter. The B42 chimera is under control of the
strong, regulatable galactose promoter. Both the GR and the two
DHFR genes were introduced into the multiple cloning sites of
the commercial LexA and B42 expression vectors using standard
molecular biology techniques. The GR fusions are shown in
Figure 11. The available restriction sites result in a three
amino acid spacer between the two proteins in both the GR and
the DHFR constructs. The plasmids encoding the LexA- and B42-
fusion proteins were introduced in all necessary combinations
into S. cerevisiae strain FY250 containing the lacZ reporter
plasmid.

Three initial assays are conducted: (1) toxicity of the ligand
and receptor, (2) cell permeability of the H1-H2 molecules as
judged by competition in the yeast three-hybrid system, and (3)
activation of lacZ transcription by the H1-H2 molecule as judged
by X-gal hydrolysis. All of these experiments have been done
as plate assays. The toxicity of the ligand and receptor is
judged simply by seeing if either induction of the receptor
fusions or application of the ligand to the plate impairs cell
growth. Cell permeability is assessed based on the ability of
an excess of DEX-DEX or DEX-MTX to disrupt DEX-FK506 induction
of lacZ transcription in the yeast three-hybrid system. An
excess of DEX-DEX or DEX-MTX should bind to all of the available
LexA-GR chimera and disrupt transcription activation so long as
the molecule is cell permeable and retains the ability to bind
to GR. Effective protein dimerization by H1-H2 is assayed by
activation of lacZ transcription.

The DEX-DEX molecules were tested by all three assays.
Preliminary results show that neither DEX nor GR is toxic.
Under the conditions tried thus far, none of the DEX-DEX
molecules tested is efficient at protein dimerization as judged
by the lacZ transcription assay. We have been able to repeat
the yeast three-hybrid result (activation of lacZ transcription
using DEX-FK506) in our lab. DEX-DEX 1 and DEX-DEX 5 have been
assayed for cell permeability. At 1 μM DEX-FK506 and 10 μM
DEX-DEX, DEX-DEX 1, but not DEX-DEX 5, decreases lacZ
transcription in the yeast three-hybrid system by 50%. These
results show that a DEX-DEX molecule is cell permeable and
retains the ability to bind to GR.

The protein chimeras can be varied in four ways: (1) invert the
orientation of the B42 activation domain and the receptor; (2)
introduce tandem repeats of the receptor; (3) introduce
(GlyGlySer)n linkers between the protein domains; (4) vary the
DNA-binding domain and the transcription activation domain. We
expect these experiments to be carried out over the next two
years. The motivation for these experiments is that many
different protein fusions have been reported in the literature
and these types of modifications have been shown to be critical
in these previous experiments. We have designed each of these
experiments so that multiple variations can be made
simultaneously. Inverting the orientation so that the receptor,
not B42, is N-terminal is trivial. We will construct a generic
vector that can be used with different receptors. Likewise,
since several different DNA-binding domains and activation
domains have been used with the yeast two-hybrid system, it is
not difficult to vary these domains.

An approach to introducing tandem repeats of the receptor and
(GlyGlySer)n linkers that allows us to make multiple constructs
simultaneously is provided. As illustrated for GR, the approach
to making tandem repeats of the receptor is to use restriction
enzymes with compatible cohesive ends (Fig. 14). The same PCR
product can then be used to introduce each receptor unit. By
including a BamHI restriction site immediately 5' to the gene
encoding GR, a series of (GlyGlySer)n linkers can be introduced
essentially as described. This approach relies on the fact that
the BamHI site, GGA-TCC, encodes Gly-Ser. This combined
approach will allow for the construction of multiple protein
chimeras simultaneously. Since a lacZ screen is used, all of
these constructs can be assayed simultaneously.
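
The statement that the BamHI site encodes Gly-Ser can be checked
with a few lines of code. The sketch below builds the standard
genetic code table and translates the six-base site in frame;
the helper name `translate` is an illustrative choice, and the
example assumes the site is read in the frame GGA-TCC as stated
above.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acids."""
    dna = dna.upper().replace("-", "")
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


BAMHI_SITE = "GGATCC"
print(translate(BAMHI_SITE))  # G = glycine, S = serine
```

Because the recognition site itself reads as a glycine codon
followed by a serine codon, it can sit inside a GlyGlySer-type
linker without introducing any other residue, provided the
reading frame is maintained.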

Design of linkers that respond to specific enzymatic activity

The linkers, X-B-Y, may be designed according to the type of
enzymatic activity desired. The linkers are readily synthesized
by known techniques. For example, the following linkers may be
used:

1) Glycosidic bond, which may be cleaved by a glycosidase
enzyme and formed by a glycosyltransferase enzyme.

2) Phosphodiester bond.

3) Amide bond, which may be cleaved by a protease and formed by
a peptidase or transpeptidase. An example of such a bond is the
cephem bond shown in Figures 4 and 12.

4) Ester bond.

5) Aldol product bond, which is cleaved by a retro-aldolase and
formed by an aldolase.




Other Enzymes and classes of enzymes

With a linker that contains an appropriate bond, the subject
invention can screen derivatives of large classes of enzymes.
A variety of enzymes and enzyme classes are listed on the World
Wide Web beginning at prowl.rockefeller.edu/enzymes/enzymes.htm.
Each enzyme is given an Enzyme Commission (E.C.) number
allowing it to be uniquely identified. E.C. numbers have four
fields separated by periods, "a.b.c.d". The left-hand-most
field represents the broadest classification for the enzyme.
The next field represents a finer division of that broad
category. The third field adds more detailed information and
the fourth field defines the specific enzyme. Thus, in the "a"
field the classifications are oxidoreductases, transferases,
hydrolases, lyases, isomerases, and ligases. Each of these "a"
classifications is then further separated into corresponding
"b" classifications, each of which in turn is separated into
corresponding "c" classifications, which are then further
separated into corresponding "d" classes.
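
As an illustration of the four-field "a.b.c.d" scheme, the
following sketch splits an E.C. number into its fields and looks
up the top-level class. The function name `parse_ec` and the
returned dictionary layout are hypothetical conveniences; only
the six "a" classes named above are encoded. The example value
3.5.2.6 is the E.C. number of β-lactamase.

```python
# Top-level ("a" field) classes as listed in the text.
EC_CLASSES = {
    1: "oxidoreductases",
    2: "transferases",
    3: "hydrolases",
    4: "lyases",
    5: "isomerases",
    6: "ligases",
}


def parse_ec(ec_number: str) -> dict:
    """Split an E.C. number "a.b.c.d" into its four integer fields."""
    text = ec_number.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    parts = text.split(".")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected four numeric fields, got {ec_number!r}")
    a, b, c, d = (int(p) for p in parts)
    if a not in EC_CLASSES:
        raise ValueError(f"unknown top-level class {a}")
    return {
        "class": a,
        "class_name": EC_CLASSES[a],
        "subclass": f"{a}.{b}",
        "sub_subclass": f"{a}.{b}.{c}",
        "enzyme": f"{a}.{b}.{c}.{d}",
    }


info = parse_ec("EC 3.5.2.6")
print(info["class_name"], info["subclass"], info["sub_subclass"])
```

Reading the fields from left to right narrows the description
from the general reaction type down to the individual enzyme.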



The subclasses of oxidoreductases are, for example:
1.1 CH-OH, 1.2 aldehyde or oxo, 1.3 CH-CH, 1.4 CH-NH2, 1.5
CH-NH, 1.6 NADH or NADPH, 1.7 other N-containing, 1.8 sulfur,
1.9 heme, 1.10 diphenols and related, 1.11 peroxidases, 1.12
hydrogen, 1.13 single donors + O2, 1.14 paired donors + O2, 1.15
superoxide radical, 1.16 oxidizing metal ions, 1.17 CH2, 1.18
reduced ferredoxin, and 1.19 reduced flavodoxin.

The subclasses of transferases are, for example: 
2.1 one carbon, 2.2 aldehydes or ketones, 2.3 acyl, 2.4 
glycosyl, 2.5 alkyl or aryl, 2.6 N-containing, 2.7 P-containing, 
2.8 S-containing, and 2.9 Se-containing.

The subclasses of hydrolases are, for example: 

3.1 ester, 3.2 glycosidic, 3.3 ether, 3.4 peptide, 3.5 C-N (non- 
peptide), 3.6 acid anhydride, 3.7 C-C, 3.8 C-halide, 3.9 P-N, 
3.10 S-N, 3.11 C-P, and 3.12 S-S. 



The subclasses of lyases are, for example:

4.1 C-C, 4.2 C-O, 4.3 C-N, 4.4 C-S, 4.5 C-halide, and 4.6 P-O.

The subclasses of isomerases are, for example:

5.1 racemases and epimerases, 5.2 cis-trans isomerases, 5.3
intra-oxidoreductases, 5.4 intra-transferases (mutases), and 5.5
intramolecular lyases.

The subclasses of ligases are, for example:

6.1 C-O, 6.2 C-S, 6.3 C-N, 6.4 C-C, and 6.5 P-ester.

Each of the mentioned classes is further separated into
sub-subclasses, i.e. the "c" level, and then the "d" level.

Moreover, new enzymes are discovered and are intended to be 
included within the scope of this invention, which is itself 
designed to evolve or discover such new enzymes. 

Design of reporter genes 

A reporter gene assay measures the activity of a gene's
promoter. It takes advantage of molecular biology techniques,
which allow one to put heterologous genes under the control of
a mammalian cell (Gorman, C.M. et al., Mol. Cell Biol. 2:
1044-1051 (1982); Alam, J. and Cook, J.L., Anal. Biochem. 188:
245-254 (1990)). Activation of the promoter induces the reporter
gene as well as or instead of the endogenous gene. By design 
the reporter gene codes for a protein that can easily be 
detected and measured. Commonly it is an enzyme that converts 
a commercially available substrate into a product. This 
conversion is conveniently followed by either chromatography or 
direct optical measurement and allows for the quantification of 
the amount of enzyme produced. 

Reporter genes are commercially available on a variety of
plasmids for the study of gene regulation in a large variety of
organisms (Alam and Cook, supra). Promoters of interest can be
inserted into multiple cloning sites provided for this purpose
in front of the reporter gene on the plasmid (Rosenthal, N.,
Methods Enzymol. 152: 704-720 (1987); Shiau, A. and Smith, J.M.,
Gene 67: 295-299 (1988)). Standard techniques are used to
introduce these genes into a cell type or whole organism (e.g.,
as described in Sambrook, J., Fritsch, E.F. and Maniatis, T.,
Expression of cloned genes in cultured mammalian cells. In:
Molecular Cloning, edited by Nolan, C. New York: Cold Spring
Harbor Laboratory Press, 1989). Resistance markers provided on
the plasmid can then be used to select for successfully
transfected cells.

Ease of use and the large signal amplification make this
technique increasingly popular in the study of gene regulation. 
Every step in the cascade DNA -> RNA -> Enzyme -> Product -> 
Signal amplifies the next one in the sequence. The further down 
in the cascade one measures, the more signal one obtains. 

In an ideal reporter gene assay, the reporter gene under the 
control of the promoter of interest is transfected into cells, 
either transiently or stably. Receptor activation leads to a 
change in enzyme levels via transcriptional and translational 
events. The amount of enzyme present can be measured via its 
enzymatic action on a substrate. 

Host Cell

The host cell for the foregoing screen may be any cell capable
of expressing the protein or cDNA library of proteins to be
screened. Some suitable host cells have been found to be yeast
cells, Saccharomyces cerevisiae, and E. coli.

This invention will be better understood from the Experimental 
Details which follow. However, one skilled in the art will 
readily appreciate that the specific methods and results 
discussed are merely illustrative of the invention as described 
more fully in the claims which follow thereafter. 

EXPERIMENTAL DETAILS



Example 1 

We have shown that Dex-Mtx can dimerize a LexA-DHFR and a
B42-rGR protein chimera in vivo (Table I). (Lin, 1999) Dex-Mtx
was assayed using both plate and liquid assays at extracellular
concentrations of 1-100 μM. No activation was observed at
concentrations < 0.1 μM. 100 μM is the limit of Dex-Mtx
solubility. Control experiments established that lacZ
transcription is dependent on Dex-Mtx. There are only
background levels of lacZ transcription when Dex-Mtx is omitted,
LexA-DHFR is replaced with LexA, or B42-GR is replaced with B42.
Likewise, a 10-fold excess of Mtx competes out Dex-Mtx-dependent
lacZ transcription. Interestingly, of the 10 protein chimera
combinations tested, Dex-Mtx could only activate lacZ
transcription in the context of the LexA-eDHFR and the
B42-(Gly6)-rGR chimeras (Table I). None of the 9 other protein
combinations tested worked. This result is consistent with our
view that the Dex-Mtx systems (and other dimerization systems)
could be further improved both by biochemical and structural
characterization and by variation of the protein chimeras and
the reporter.

Table I.

Strain(a)  LexA Chimera    B42 Chimera          Dex-Mtx Dimerization(b)
1          LexA-eDHFR(c)   B42-Gly6(d)-rGR2(e)  Yes
2          LexA-eDHFR      B42-rGR2             No
3          LexA-eDHFR      B42-(rGR2)3          No
4          LexA-mDHFR(f)   B42-Gly6-rGR2        No
5          LexA-mDHFR      B42-rGR2             No
6          LexA-mDHFR      B42-(rGR2)3          No
7          LexA-rGR2       B42-eDHFR            No
8          LexA-rGR2       B42-mDHFR            No
9          LexA-(rGR2)3    B42-eDHFR            No
10         LexA-(rGR2)3    B42-mDHFR            No

(a) S. cerevisiae strain FY250 containing pMW106 (the lacZ
reporter plasmid), pMW103 (encoding the LexA chimera), and
pMW012 (encoding the B42 chimera). (b) Dex-Mtx-dependent
dimerization was determined using standard assays for lacZ
transcription. See the text for details. (c) The E. coli DHFR.
(d) In some constructs a 6-glycine linker was added between B42
and the rGR. (e) A mutant form of the hormone-binding domain of
the glucocorticoid receptor (residues 524-795, Phe 620 to Ser,
Cys 656 to Gly) with increased affinity for Dex was used in
these studies. (f) The murine DHFR.
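
For readers who wish to work with Table I programmatically, the
ten strain combinations can be held in a simple list and
filtered. This is only a convenience sketch; the variable and
function names are illustrative, and the chimera labels are
written without the footnote markers.

```python
# (strain, LexA chimera, B42 chimera, Dex-Mtx-dependent dimerization)
TABLE_I = [
    (1, "LexA-eDHFR", "B42-Gly6-rGR2", True),
    (2, "LexA-eDHFR", "B42-rGR2", False),
    (3, "LexA-eDHFR", "B42-(rGR2)3", False),
    (4, "LexA-mDHFR", "B42-Gly6-rGR2", False),
    (5, "LexA-mDHFR", "B42-rGR2", False),
    (6, "LexA-mDHFR", "B42-(rGR2)3", False),
    (7, "LexA-rGR2", "B42-eDHFR", False),
    (8, "LexA-rGR2", "B42-mDHFR", False),
    (9, "LexA-(rGR2)3", "B42-eDHFR", False),
    (10, "LexA-(rGR2)3", "B42-mDHFR", False),
]


def working_combinations(table):
    """Return (strain, LexA chimera, B42 chimera) for rows that dimerize."""
    return [(s, lexa, b42) for s, lexa, b42, ok in table if ok]


hits = working_combinations(TABLE_I)
print(f"{len(hits)} of {len(TABLE_I)} combinations activate lacZ: {hits}")
```

Listing the combinations this way makes it plain that only
strain 1, pairing LexA-eDHFR with B42-Gly6-rGR2, supported
Dex-Mtx-dependent transcription.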

Example 2

Cephalosporin Hydrolysis by the 908R Cephalosporinase in the
yeast three-hybrid system

The subject invention is exemplified using the components of the
yeast three-hybrid system (Licitra; represented in Fig. 2; see
also U.S. Patent No. 5,928,868). In this system DEX-FK506
(exemplifying H1-H2) mediates dimerization of the protein
fusions LexA-GR (representing reporter V-H1 receptor) and B42-
FKBP12 (representing reporter W-H2 receptor), thus activating
transcription of a lacZ reporter gene. The chemical handles H1
and H2 and the protein dimerization assay, however, all can be
varied.
In the subject invention, however, the yeast three-hybrid system
is altered by inserting a BOND, B, as well as any required
spacers X and Y, so as to form a small molecule having the
structure H1-X-B-Y-H2. While there is ample precedent for
small-molecule mediated protein dimerization, what remains is
to show that these assays can be used to select for catalysts.
Cephalosporin hydrolysis by a cephalosporinase provides a simple
cleavage reaction to demonstrate the selection (Fig. 3). The
BOND, B, in this example is a cephem linkage susceptible to
attack by the cephalosporinase, such that hydrolysis by the
cephalosporinase results in separation of the proteins and
deactivation of the transcription of lacZ.
The E. cloacae 908R cephalosporinase is well characterized both
biochemically (Galleni 1988; Galleni 1988; Galleni 1988; Monnaie
1992) and structurally (Lobkovsky 1993) and is simple to
manipulate. Several approaches have been developed for
modifying cephalosporin antibiotics at the C7 and C3' positions
to improve their pharmacokinetic properties and to prepare
pro-drugs. (Druckheimer 1988; Albrecht 1990; Vrudhula 1995; Meyer
1995)



Cephalosporin hydrolysis by the cephalosporinase can disrupt
protein dimerization and hence be used to discriminate between
cells containing active and inactive enzyme. Specifically, (1)
(C7)DEX-CEPHEM-(C3')FK506 is synthesized; (2) DEX-CEPHEM-FK506
is shown to dimerize LexA-GR and B42-FKBP12, and both DEX and
FK506 are shown to disrupt the dimerization; (3) induction of
the wild-type cephalosporinase, but not an inactive Ser64
variant, is shown to disrupt cephem-mediated protein
dimerization; and (4) cells containing active cephalosporinase
are identified based on loss of protein dimerization in a mock
screen. A screen for loss of lacZ transcription is sufficient
for this purpose.

The retro-synthesis of DEX-CEPHEM-FK506 is shown in Figure 4;
it allows H1, H2, and the linker molecules to be varied. The
allylic chloride intermediate 2 has been synthesized from cephem
1 in 20% yield in four steps. Mild conditions for coupling H2-SH
to the allylic chloride 2 using sodium iodide have been
developed; DEX-SH can be coupled in 82% yield. 908R
cephalosporinase variants have been constructed both with and
without nuclear-localization sequences under control of GAL1 and
MET25 promoters. All of these variants are known to be active
in vivo, based on hydrolysis of the chromogenic substrate
nitrocefin (Pluckthun 1987). Several S. cerevisiae strains
suitable for this model reaction have been constructed.
DEX-FK506 is known to dimerize LexA-rGR and B42-FKBP12 in these
strain backgrounds (yeast three-hybrid system).



All of the components needed for the proof of principle have
been prepared. Specifically, we have developed a modular
synthesis of Dex-cephem-Mtx and constructed a S. cerevisiae
strain suitable for the proof of principle. The retro-synthesis
of Dex-cephem-Dex is shown in Figure 12; it allows H1, H2, and
the linker molecules to be varied to optimize the cephem
substrate. We have synthesized the allylic chloride
intermediate 2 from cephem 1 in 20% yield in four steps. We
have developed mild conditions for coupling H2-SH to the allylic
chloride 2 using sodium iodide; Dex-SH can be coupled in 82%
yield. We have constructed strain
FY250/pMW106/pMW2rGR2/pMW3FKBP12 and shown that Dex-FK506 can
still mediate dimerization of LexA-rGR and B42-FKBP12 in this
strain. The strain provides an additional marker for the
enzyme, grows well on galactose and raffinose, and replaces all
of the ampR markers with kanR or specR markers. In addition, we
have prepared several constructs for the galactose- or
methionine-regulated overexpression of the cephalosporinase.
Based on hydrolysis of the chromogenic substrate nitrocefin
(Pluckthun, 1987) we have shown that the cephalosporinase is
active in the FY250 background.
The basis for catalysis by the cephalosporinase is studied using
combinatorial techniques. Understanding the mechanism is
important for anticipating future routes to antibiotic
resistance and for developing new cephalosporin antibiotics.



Dex-cephem-Mtx induces protein dimerization in vivo

Preparation of a Dex-cephem-Mtx (cleavable cephem linker)

The cephem substrates were designed such that introduction of
the Dex and Mtx ligands would not interfere with
cephalosporinase hydrolysis of the cephem core and so that a
variety of Dex-cephem-Mtx substrates could be synthesized
readily from commercially available materials. (The chemistry
of the β-lactams; Durckheimer 1988; Albrecht 1990; Meyer 1995;
Zlokarnik 1998) We synthesized four potential Dex-cephem-Mtx
substrates from a commercial amino-chloro-cephem intermediate.
Dexamethasone was coupled to the C7 amino group of the cephem
core via aminocarboxylic acids of different lengths, and
methotrexate to the C3' chloro group via aminothiols of
different lengths. All four compounds were prepared from three
components in 3-4 steps in 10-30% overall yield.



The critical issue was whether introduction of the cephem linker
would impede either the cell permeability or the dimerization
activity of the Dex-Mtx CID. We screened all four Dex-cephem-
Mtx compounds using the yeast two-hybrid lacZ transcription
assay and determined that all four compounds are cell permeable
and that two of these compounds are capable of inducing protein
dimerization in vivo, as shown in Figure 15. Based on these
results, it appears that the lengths of the linkers between the
cephem core and the Dex and Mtx ligands are important; the
cephem core must not be too close to either ligand, or it will
prevent access to the receptor. These results support the
general feasibility of preparing CIDs with cleavable linkers and
using these compounds in vivo with the catalysis screen.
The ability of this Dex-cephem-Mtx CID to serve as a read-out
for catalysis is evaluated using a well-studied enzymatic
reaction, cephem hydrolysis by a cephalosporinase. Hydrolysis
of the lactam bond results in expulsion of the leaving group at
the C3' position, effectively breaking the bond between Dex and
Mtx.




Having identified Dex-cephem-Mtx substrates that are efficient
dimerizers in the yeast two-hybrid assay, the next step is to
demonstrate that the screen can discriminate between active and
inactive enzymes. The penicillin-binding protein (PBP) from
Streptomyces R61 provides a good control "inactive" enzyme to
compare to the active Q908R cephalosporinase. (Kelly 1986;
Ghuysen 1991) Cephalosporinases are believed to have evolved
from PBPs. (Ghuysen 1991; Knox 1996) Both enzymes have the same
three-dimensional fold and follow the same catalytic mechanism
involving an acyl-enzyme intermediate. (Kelly 1986; Lobkovsky
1993) PBPs bind to cephems with high affinity and form the
acyl-enzyme intermediate rapidly, but hydrolyze the acyl-enzyme
intermediate much more slowly than do cephalosporinases. We
have introduced both the Q908R cephalosporinase and the R61 PBP
into yeast shuttle vectors that place the enzymes under control
of either a galactose-inducible or a methionine-repressible
promoter. Based on plate assays using the chromogenic substrate
nitrocefin, (Pluckthun 1987) the Q908R enzyme was expressed in
an active form in yeast with either promoter. This assay cannot
detect PBP activity.

The Dex-cephem-Mtx CID screen distinguishes between the
cephalosporinase and the PBP. Yeast strains containing the
cephalosporinase hydrolyze the cephem linkage rapidly,
disrupting lacZ transcription. The PBP, on the other hand,
hydrolyzes the cephem linkage too slowly to change the levels of
lacZ transcription significantly.

Can the CID screen detect catalytic activity? 

Strong support for the feasibility of using CIDs with cleavable
linkers to detect catalytic activity is provided by in vivo
selections for protease activity based on cleavage of internal
protease sites engineered in a variety of proteins, including
Gal4. With an active Dex-cephem-Mtx CID in hand, our next step
is to find conditions where the CID screen gives an enzyme-
dependent signal. We envision two scenarios that should result
in an enzyme-dependent signal: (1) overexpression of the enzyme
relative to the LexA- and B42-reporter proteins and (2)
expression of the enzyme prior to expression of the LexA- and
B42-reporter proteins. The Brent Y2H vectors currently employed
in the lab will have to be modified to allow for control over
the levels and timing of LexA and B42 expression. As supplied,
the Brent vectors have the LexA fusion protein under control of
the strong, constitutive alcohol dehydrogenase promoter (P_ADH)
and the B42 fusion protein under control of the strong
galactose-inducible promoter (P_GAL). Both vectors contain the
high-copy yeast 2μ origin of replication. We plan simply to
place the LexA fusion protein under control of a galactose-
inducible promoter, just like B42. The GAL promoter is the most
tightly regulated promoter available in yeast and is induced by
galactose and repressed by glucose. It can be fully repressed,
and it can direct expression of a range of intermediate protein
concentrations by varying the relative percentages of glucose
and galactose in the growth media. Thus, with both LexA and B42
under control of GAL promoters, these reporter proteins can be
turned off and then on or expressed at intermediate
concentrations in concert. If this approach does not work,
there are many other ways to tune the sensitivity of the system.
The expression of the enzyme, LexA, and B42 can all be
controlled using other inducible or constitutive promoters or
by integrating LexA and B42 into the chromosome. The lacZ
reporter gene can be replaced with other chromogenic reporters
or selectable markers. Alternatively, the sensitivity of the
system can be tuned by varying the substrate:product ratio by
adding both Dex-cephem-Mtx (substrate) and Dex and Mtx
("product") to the growth media.

Once conditions are found where we can detect enzyme-dependent
cleavage of the cephem linker, we will carry out a mock screen
as a proof-of-principle experiment. Specifically, plasmids
encoding the cephalosporinase and the PBP in a ratio of 1:99
will be introduced into a yeast strain carrying the appropriate
protein chimera and reporter genes. Cells harboring the
cephalosporinase should be white, while those containing the PBP
should be blue. Plasmids from these colonies will be isolated
and sequenced to confirm the identity of the expressed enzyme.
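
As a rough planning aid for the mock screen, the following sketch
estimates how many colonies would need to be examined to observe at
least one cephalosporinase-containing (white) colony at the 1:99
plasmid ratio. It assumes that each colony independently carries the
cephalosporinase plasmid with probability 0.01; the function names
and the 95% confidence target are illustrative assumptions and are
not part of the protocol described above.

```python
import math


def prob_at_least_one(n_colonies: int, fraction: float = 0.01) -> float:
    """Probability that at least one of n colonies carries the rare plasmid.

    Assumes independent colonies, each carrying the cephalosporinase
    plasmid with probability `fraction` (0.01 for a 1:99 mixture).
    """
    return 1.0 - (1.0 - fraction) ** n_colonies


def colonies_needed(confidence: float = 0.95, fraction: float = 0.01) -> int:
    """Smallest colony count that reaches the requested confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - fraction))


if __name__ == "__main__":
    n = colonies_needed(0.95)
    print(n, round(prob_at_least_one(n), 4))
```

Under these assumptions, a few hundred colonies on a plate would be
enough to make a white colony likely, which is compatible with a
plate-based lacZ screen.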

Level of catalytic activity detected using the CID screen

While these experiments will show that the CID screen can detect
catalytic activity, they will not show that the screen can be
used to amplify enzymes with low levels of catalytic activity.
Thus, our next step is to use cephalosporinase mutants with a
range of catalytic efficiencies to quantify and then optimize
the sensitivity of the system. Many β-lactamase mutants, either
found in clinical settings or constructed by site-directed
mutagenesis, have been fully characterized kinetically. Known
mutants of the Q908R cephase, the E. cloacae P99 cephase (99%
identical), and the E. coli K12 AmpC β-lactamase (71%
homologous) are available spanning a wide range of kcat, Km, and
kcat/Km values (Table II). To accurately gauge the relative
activities of the mutants in the CID and ampR screens, we will
determine kinetic rate constants for the corresponding Q908R
cephase variants with the Dex-cephem-Mtx and ampicillin
substrates and nitrocefin as a control. The Q908R cephase
variants will be constructed in the E. coli expression vector
by site-directed mutagenesis, using a PCR-based method. These
proteins will then be purified by nickel-affinity
chromatography, and rate constants will be determined by UV
spectroscopy, monitoring the disappearance of absorbance due to
the β-lactam bond.

After determining the activity of the mutants with Dex-cephem-
Mtx and ampicillin in vitro, these same mutants are tested in
the CID and ampR screens. In addition to plate and more
quantitative liquid lacZ assays, the mutants will be evaluated
using a ura3 reporter gene. Ura3, which encodes orotidine-5'-
phosphate decarboxylase and is required for uracil biosynthesis,
is used routinely as a selectable marker in yeast. Since large
numbers of protein variants need to be screened for the
evolution experiments, it will be important to move from a
screen to a growth selection. Ura3 has the advantage that it
can be used both for positive and negative selections: positive
for growth in the absence of uracil and negative for conversion
of 5-fluoroorotic acid (5-FOA) to 5-fluorouracil, a toxic
byproduct. Cleavage of the cephem bond and disruption of ura3
transcription will be selected for based on growth in the
presence of 5-FOA. The advantage of the 5-FOA selection is that
the timing of addition of both the Dex-cephem-Mtx substrate and
5-FOA can be controlled. Several other reporter genes, however,
have been reported. The mutants are evaluated in E. coli using
nitrocefin screens and ampR selections. Mutants with higher
activity (kcat/Km) will still show an enzyme-dependent signal
(failure to hydrolyze X-gal or growth in the presence of 5-FOA
in yeast; nitrocefin hydrolysis or resistance to ampicillin in
E. coli), but at some point these assays will not be able to
detect the less active mutants. In addition to suggesting what
range of activities can be detected with these assays, these
experiments may bring surprising results. For example, it may be
that detection correlates more strongly with kcat than with Km
or kcat/Km. Assuming a dynamic range of >1000, we will proceed
with the enzyme evolution experiments. Otherwise, we will focus
on optimizing the sensitivity of the screen until we reach this
level of sensitivity. The optimization experiments will
continue along the same lines as the proof-of-principle
experiments, varying the levels and timing of both protein
expression and addition of the substrate and product, except
they will be carried out with mutant cephases at the limit of
detection.

Table II. Wild-type and mutant enzymes are shown with their
kinetic rate constants with the chromogenic cephalosporin
nitrocefin, as well as the percentage of wild-type kcat/Km as
calculated in that experiment.

Enzyme                Km (μM)      kcat (s^-1)    kcat/Km (M^-1 s^-1)   % WT
E. cloacae P99 wt     25 ± 1       780 ± 30       3.1 x 10^7            100
E. cloacae Q908R wt   23 ± 1       780 ± 30       3.4 x 10^7            100
K12 AmpC wt           500 ± 100    490 ± 90       1.0 x 10^6            100
P99 286-290 TSFGN     19 ± 0.5     261 ± 7        1.37 x 10^7           96
P99 286-290 LTSNR     43 ± 2       330 ± 11       7.7 x 10^6            54
P99 286-290 NNAGY     31 ± 11      53 ± 10        1.7 x 10^6            12
K12 Y150S             108 ± 21     2.11 ± 0.12    1.9 x 10^4            ~1
K12 Y150E             356 ± 34     0.51 ± 0.03    1.4 x 10^3            ~0.1
Q908R S64C            > 1000       > 18           1.76 x 10^4           0.05
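
The catalytic efficiencies in Table II follow directly from the
listed Km and kcat values once Km is converted from micromolar to
molar. The sketch below illustrates that arithmetic for a few
entries; the dictionary and function names are illustrative
assumptions, only the central values (not the error bounds) are
used, and the entries are transcribed from Table II.

```python
# (Km in micromolar, kcat in 1/s): central values transcribed from Table II.
TABLE_II = {
    "E. cloacae P99 wt": (25.0, 780.0),
    "E. cloacae Q908R wt": (23.0, 780.0),
    "K12 AmpC wt": (500.0, 490.0),
    "K12 Y150S": (108.0, 2.11),
    "K12 Y150E": (356.0, 0.51),
}


def catalytic_efficiency(km_micromolar: float, kcat_per_s: float) -> float:
    """Return kcat/Km in M^-1 s^-1, converting Km from micromolar to molar."""
    return kcat_per_s / (km_micromolar * 1e-6)


if __name__ == "__main__":
    for enzyme, (km, kcat) in TABLE_II.items():
        print(f"{enzyme:22s} kcat/Km = {catalytic_efficiency(km, kcat):.2e}")
```

The computed values agree with the kcat/Km column to the precision
given in the table, which also confirms that the Km column is in
micromolar units.
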
Example 3

CIDs can be used to screen cDNA libraries based on biochemical
function. This glycosidase example is used to determine the
best method for expressing the cDNA clones and to optimize the
screening process.



Proof of Principle - β-Galactosidase Activity Assays

Table III explains the components of each strain. Each strain
was constructed from the parent yeast strain FY250 and also
contains the pMW106 plasmid, which has the lacZ reporter gene
that is turned on only when the LexA DNA-binding domain and
the B42 activation domain are brought into the vicinity of each
other. We use several different strains because we use DHFR from
two different species: mDHFR is murine, while eDHFR is from
E. coli. We are also able to switch the small-molecule-binding
domains. For example, the strain containing LexA-eDHFR with
B42-rGR2 is a different strain and behaves differently from the
strain containing LexA-rGR2 with B42-eDHFR. We also put
short 6-amino-acid linkers between the two domains of our
protein chimeras, and thus these are different strains as well.

Next, we have chosen to screen a yeast cDNA library for proteins
with glycosidase activity (Figure 19).

Table III. Identification of strains used. (Key: eDHFR = E. coli
dihydrofolate reductase; rGR2 = steroid-binding domain of rat
glucocorticoid receptor (aa 524-795) with point mutations;
(rGR2)3 = trimer of rGR2; mDHFR = murine dihydrofolate reductase;
gly6 = 6-amino-acid linker containing 6 glycines; (GSG)2 = 6-amino-
acid linker containing glycine-serine-glycine-glycine-serine-
glycine.)

Strain        LexA            B42
[illegible]   [illegible]     [illegible]
[illegible]   [illegible]     [illegible]
[illegible]   mDHFR           [illegible]
[illegible]   [illegible]     [illegible]
[illegible]   [illegible]     [illegible]
V507Y         [illegible]     [illegible]
V501Y         [illegible]     (GSG)2rGR2
V504Y         [illegible]     (GSG)2rGR2
V494Y         [illegible]     (GSG)2rGR2
V497Y         [illegible]     [illegible]
V510Y         (GSG)2rGR2      (GSG)2eDHFR
V512Y         (GSG)2rGR2      (GSG)2mDHFR
V498Y         (GSG)2eDHFR     rGR2
V502Y         (GSG)2mDHFR     rGR2
V499Y         (GSG)2eDHFR     gly6rGR2
V503Y         (GSG)2mDHFR     gly6rGR2
V509Y         rGR2            (GSG)2eDHFR
V511Y         rGR2            (GSG)2mDHFR
V506Y         (GSG)2rGR2      eDHFR
V508Y         (GSG)2rGR2      mDHFR
V513Y         eDHFR           (rGR2)3
V514Y         mDHFR           (rGR2)3
V517Y         (rGR2)3         eDHFR
V518Y         (rGR2)3         mDHFR
V515Y         (GSG)2eDHFR     (rGR2)3
V516Y         (GSG)2mDHFR     (rGR2)3

Strain        LexA            B42             Control
V134Y         Sec16p          Sec6p           positive control
V133Y         Sec13           Sec6p           positive control
V381Y         blank           blank           negative control
V379Y         eDHFR           blank           negative control
V560Y         blank           (GSG)2rGR2      negative control

β-Galactosidase Activity Assay Results

The results in Table IV are averages of two separate trials.
Each strain was examined with and without small molecules. The
absolute activity is given as the β-galactosidase activity with
small molecule minus the β-galactosidase activity without small
molecule. The average β-galactosidase activity for a strain
without small molecule (i.e., the negative control) was about
100 β-galactosidase units. V133Y is a positive control and
shows β-galactosidase activity regardless of the presence of
small molecule. The β-galactosidase activity of strain V494Y
at varying concentrations of D8M is shown in Figure 18.



Table IV - β-galactosidase Activity Assays

              β-gal activity
Strain        1 μM DM1     1 μM D8M     1 μM D10M
V375Y         4978         5210         9993
V493Y         5753         5555         5812
V496Y         -30          -27          740
V495Y         15           38           513
V505Y         557          2532         1160
V507Y         -7           -6           -44
V501Y         4662         6660         2286
V504Y         12           30           556
V494Y         9976         10568        9398
V497Y         -8           24           308
V510Y         601          3163         2314
V512Y         -1           -4           6
V498Y         4735         5442         2926
V502Y         21           30           497
V499Y         4368         7012         4013
V503Y         -5           45           1132
V509Y         307          2734         2028
V511Y         -113         -129         -60
V506Y         519          3867         2561
V508Y         0            -5           5

Controls                                 β-gal activity
V133Y (Positive Control)                 1912
No small molecules (Negative Control)    96.9374475
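
The background subtraction described above and a simple ranking of
strains by their response can be sketched as follows. The D8M
values are transcribed from Table IV; the 1000-unit cutoff and the
function names are illustrative assumptions, not criteria used in
the experiments.

```python
# Absolute beta-galactosidase units at 1 uM D8M, transcribed from Table IV.
D8M_ACTIVITY = {
    "V375Y": 5210, "V493Y": 5555, "V496Y": -27, "V495Y": 38,
    "V505Y": 2532, "V507Y": -6, "V501Y": 6660, "V504Y": 30,
    "V494Y": 10568, "V497Y": 24, "V510Y": 3163, "V512Y": -4,
    "V498Y": 5442, "V502Y": 30, "V499Y": 7012, "V503Y": 45,
    "V509Y": 2734, "V511Y": -129, "V506Y": 3867, "V508Y": -5,
}


def absolute_activity(with_cid: float, without_cid: float) -> float:
    """Activity with small molecule minus activity without small molecule."""
    return with_cid - without_cid


def responsive_strains(activity: dict, threshold: float = 1000.0) -> list:
    """Strains above a (hypothetical) cutoff, strongest response first."""
    hits = [strain for strain, units in activity.items() if units > threshold]
    return sorted(hits, key=lambda strain: activity[strain], reverse=True)


if __name__ == "__main__":
    for strain in responsive_strains(D8M_ACTIVITY):
        print(strain, D8M_ACTIVITY[strain])
```

With this cutoff, V494Y gives the strongest response to D8M, which
is consistent with its use for the concentration series in
Figure 18.
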
Compounds Used

[Chemical structures of compounds D8M, D7CM, and D8CM]

Glycoconjugates are the most functionally and structurally
diverse molecules in nature. (Varki, 1993) Moreover, it is now
well established that carbohydrates and protein- and lipid-
bound saccharides play essential roles in many important
biological processes, including cell structure, protein
targeting, and cell-cell interactions. (Varki, 1993)
Accordingly, glycosidases with a broad array of substrate
specificities are required to break down and modify
polysaccharides, glycoproteins, and glycolipids.

Using CIDs with structurally diverse carbohydrate linkers, we
screen a S. cerevisiae cDNA library based on glycosidase
activity. There are many examples of well-characterized
glycosidases identified in other organisms that are yet to be
identified in S. cerevisiae. α-Amylase (Sogaard, 1993; Vihinen,
1990; Qian, 1994; Wiegand, 1995; Fujimoto, 1998; Wilcox, 1984)
and xylanase (Wong, 1988; Biely, 1997) are endo-glycosidases that
break down polysaccharides involved in energy storage and cell
structure, respectively. Glycoproteins are synthesized by
modification of a core glycoside. The GlcNAcβ1→Asn and
GlcNAcβ1→4GlcNAc linkages in Asn-linked carbohydrates are
cleaved by peptide-N4-(N-acetyl-β-glucosaminyl)asparagine
amidase (PNGase F) and endo-β-N-acetylglucosaminidases (Endo H
and Endo F1), respectively. (Tarentino, 1990; Tarentino, 1992;
Robbins, 1984; Trimble, 1991) Since each of these enzymes is an
endo-glycosidase, the CID ligands should not interfere with the
enzyme-catalyzed reaction. Likewise, by making a small library
of carbohydrate linkers, we screen in an undirected fashion.

The diversity of naturally occurring carbohydrates requires us
to make a library of Dex-Mtx CIDs with different carbohydrate
linkers. Recent advances in the synthesis of oligosaccharides,
both in the coupling methods (Schmidt, 1986; Toshima, 1993;
Boons, 1996) and in the solid-phase synthesis, (Danishefsky,
1993; Seeberger, 1998; Yan, 1994; Liang, 1996) make it possible
to synthesize these linkers. We have chosen to use a method
developed by Kahne and co-workers which uses anomeric sulfoxides
as glycosyl donors and synthesizes carbohydrates from the
reducing to the non-reducing end. (Yan, 1994; Liang, 1996) This
method can be used both in solution and on solid support, can
form both α- and β-glycosidic bonds, and utilizes readily
synthesized intermediates. Several alternative methods,
however, are available, including Wong and co-workers' one-pot
solution synthesis (Zhang, 1999; Ye, 2000) and the solid-phase
glycal strategy reported by Danishefsky and co-workers.
(Danishefsky, 1993; Seeberger, 1998)

We screen a yeast cDNA library based on glycosidase activity
using Dex-Mtx CIDs with cleavable glycosidic linkers (Fig. 12).
Concurrently, we identify glycosidases from a S. cerevisiae cDNA
library by screening for cleavage of CIDs with glycosidic
linkages. The Dex-Mtx yeast two-hybrid assay is used as the
screen by replacing Dex-Mtx with Dex-oligosaccharide-Mtx.
First, we carry out a control where we screen for a known
glycosidase, chitinase, using a defined substrate. Second, we
screen for unknown glycosidases by using a small library of
substrates with different glycosidic bonds.

Screen of a S. cerevisiae cDNA Library Based on Glycosidase
Activity:

Using Dex-Mtx CIDs with cleavable oligosaccharide linkers, we
screen a S. cerevisiae cDNA library based on glycosidase
activity. As a control, we screen for a known S. cerevisiae
glycosidase, chitinase. Then, we synthesize a small library of
Dex-carbohydrate-Mtx substrates and screen the S. cerevisiae cDNA
library to identify glycosidases from the >3000 ORFs of unknown
function in S. cerevisiae.

Introduction of a S. cerevisiae cDNA library into the CID
selection strain

The first step of both the chitinase control and the random
oligosaccharide library is to introduce a S. cerevisiae cDNA
library into the CID selection strain. We use a cDNA library
reported by Fields and co-workers. (Martzen, 1999) In this
library, each cDNA clone is expressed as a GST-fusion protein
under control of a copper-inducible promoter on a shuttle vector
with a leu2 marker. (Martzen, 1999; J. R. Hudson, 1997)
Transformation efficiencies in yeast are ca. 10^6-10^7 using the
lithium acetate method, so there is ample redundancy to screen
all 6,000 ORFs in S. cerevisiae. Active clones can be
identified by sequencing the plasmid. For the chitinase control
experiment, we make a library with a subset of cDNA clones to
test different approaches for expressing the cDNA clones.
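
The redundancy argument above can be made concrete with a short
calculation. The sketch below assumes that about 10^6 independent
transformants are obtained in total and that all 6,000 clones are
equally represented in the library; both are simplifying
assumptions (real libraries are typically skewed), and the function
names are illustrative.

```python
import math


def fold_coverage(n_transformants: float, n_clones: int) -> float:
    """Average number of independent transformants per library clone."""
    return n_transformants / n_clones


def prob_clone_missing(n_transformants: float, n_clones: int) -> float:
    """Probability that a given clone is absent from all transformants.

    Assumes every transformant picks one clone uniformly at random,
    so the probability is (1 - 1/n_clones) ** n_transformants.
    """
    return math.exp(n_transformants * math.log1p(-1.0 / n_clones))


if __name__ == "__main__":
    n, clones = 1e6, 6000
    print(f"coverage: {fold_coverage(n, clones):.0f}-fold")
    print(f"P(given clone missing): {prob_clone_missing(n, clones):.1e}")
```

At roughly 170-fold average coverage, the chance of missing any
particular clone is negligible under the uniform-representation
assumption; unequal representation would reduce this margin.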

Can the S. cerevisiae chitinase be identified using the CID
selection?

We begin by screening a S. cerevisiae cDNA library for a known
glycosidase, chitinase. Chitinase hydrolyzes chitin, polymers
of β-1,4-linked N-acetylglucosamine (GlcNAc) that play a
structural role in the cell. (Muzzarelli, 1977) Chitinases from
several organisms, including S. cerevisiae, have been cloned and
characterized. (Correat, 1982; Kuranda, 1987; Kuranda, 1991) It
is known that this enzyme can hydrolyze oligomers of β-1,4-
GlcNAc ranging from trimers to heterogeneous polymers,
suggesting that CIDs such as Dex-(GlcNAc)n-Mtx should be
efficient substrates for this enzyme. Several efficient
syntheses of β-1,4-linked GlcNAc have been published. (Banoub,
1992)

The retro-synthetic analysis of our Dex-(GlcNAc)n-Mtx CID
substrate is shown in Figure 20.

The growing carbohydrate chain is linked to the solid support
via the Glu portion of Mtx. The glycosidic linkages are formed
essentially as reported by Kahne and co-workers using sulfoxide
glycosyl donors. (Yan, 1994; Liang, 1996) The final carbohydrate
is introduced as a Dex derivative, and the Mtx synthesis is
completed prior to cleavage from the solid support. This
synthesis allows the oligosaccharide linker to be varied and
both the Dex and the Mtx ligand to be introduced before cleavage
from solid support. Alternatively, the synthesis can be carried
out in solution, (Kahne, 1989) or other methods for carbohydrate
synthesis can be employed. (Zhang, 1999; Ye, 2000; Danishefsky,
1993; Seeberger, 1998) We start with a GlcNAc tetramer, as
trimers have been shown to be the shortest efficient substrates
for chitinases. (Watanabe, 1993)

Initially, lacZ plate assays are used to verify that the Dex-
(GlcNAc)n-Mtx substrates are efficient dimerizers in the yeast
three-hybrid assay. The results with Dex-cephem-Mtx support the
feasibility of incorporating structurally diverse linkers into
the CIDs. If the initial chitinase substrates, however, are not
efficient dimerizers, the linkers between the CID ligands and
the GlcNAc oligomer can be varied, or alternate dimerization
assays can be tested. Since large numbers of cDNA clones need
to be screened, the transcriptional read-out of the yeast three-
hybrid assay may be changed from a screen to a growth selection.
Specifically, ura3, which encodes orotidine-5'-phosphate
decarboxylase and is required for uracil biosynthesis, replaces
lacZ as the reporter gene. (Boeke, 1984) Ura3 has the advantage
that it can be used both for positive and negative
selections: positive for growth in the absence of uracil and
negative for conversion of 5-fluoroorotic acid (5-FOA) to
5-fluorouracil, a toxic byproduct. Cleavage of the glycosidic
bond and disruption of ura3 transcription is selected for based
on growth in the presence of 5-FOA. The advantage of the 5-FOA
selection is that the timing of addition of both the Dex-
(GlcNAc)n-Mtx substrate and 5-FOA can be controlled. Several
other reporter genes, however, can be used.

One potential problem is that the Dex-(GlcNAc)n-Mtx substrate
may be unstable, either because of its intrinsic half-life in
water or because it is turned over by cellular glycosidases. If
the substrate has a short half-life in water, the assay
conditions can be modified so that the substrate is added late
in the assay after the cells have grown to a high density, the
substrate can be continuously replenished, or the pH of the
media can be buffered. Turnover by cellular glycosidases can
simply be seen as an assay in and of itself. Using traditional
genetic approaches, random mutations can be introduced into the
S. cerevisiae genome, or the tagged knock-out strains of
Winzeler et al. can be used. (Winzeler, 1999) Cells containing
a disruptive mutation in the gene or genes cleaving the Dex-
(GlcNAc)n-Mtx substrate can be selected for by growth in the
absence of uracil.

The final step is to use the Dex-(GlcNAc)n-Mtx substrate to pull
out chitinase from a S. cerevisiae cDNA library. As described
above, a 5-FOA growth selection is used to screen the Fields
cDNA library. In the absence of chitinase, Dex-(GlcNAc)n-Mtx
induces ura3 transcription, and 5-FOA is converted to the toxic
byproduct 5-fluorouracil. Thus, only cells containing active
chitinase, or another enzyme that can cleave the substrate,
survive. The cDNA clone is readily identified by isolating the
plasmid, sequencing the N-terminus of the clone, and comparing
this sequence to that of the S. cerevisiae genome. The
advantage of using a known enzyme is that the enzyme can be
tested independently or used to spike the cDNA library. The
enzyme can be purified, and the Dex-(GlcNAc)n-Mtx substrate can
be tested in vitro. We can vary the format of the cDNA library,
the Dex-(GlcNAc)n-Mtx substrate, the screen, or the assay
conditions, or even use a different glycosidase as a control.
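
The logic of the ura3 read-out, in both its negative (5-FOA) and
positive (no uracil) modes, can be summarized as a small truth
table. The sketch below is an idealized all-or-none model: it
assumes that ura3 is transcribed if and only if the CID is intact,
and that 5-FOA plates are supplemented with uracil. The function
name and arguments are illustrative assumptions.

```python
def survives(cid_intact: bool, five_foa: bool, uracil_in_medium: bool) -> bool:
    """Idealized growth outcome for a cell in the ura3 CID selection.

    Assumption: ura3 is transcribed exactly when the CID is intact
    (i.e., no enzyme has cleaved the linker).
    """
    ura3_on = cid_intact
    if five_foa and ura3_on:
        # Ura3 converts 5-FOA into toxic 5-fluorouracil.
        return False
    if not uracil_in_medium and not ura3_on:
        # Without Ura3 the cell cannot make its own uracil.
        return False
    return True


if __name__ == "__main__":
    # Negative selection: an active glycosidase cleaves the CID.
    print(survives(cid_intact=False, five_foa=True, uracil_in_medium=True))
    # No cleaving enzyme: the CID stays intact and 5-FOA is toxic.
    print(survives(cid_intact=True, five_foa=True, uracil_in_medium=True))
```

In this model the same reporter selects for linker cleavage on
5-FOA and for an intact CID on medium lacking uracil, which is the
property exploited in both selections described above.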

Can glycosidases be identified from the >3000 unassigned ORFs
in S. cerevisiae using the CID selection?

The next step is to determine the activity of the >3000 ORFs in
S. cerevisiae with unknown function. To detect glycosidase
activity, the screen is run exactly as with the chitinase
control except using Dex-oligosaccharide-Mtx substrates with
different glycosidic linkages. The glycosidic linkages are based
on the types of carbohydrates and glycoconjugates naturally
occurring in yeast. Several activities, including
amylase, (Sogaard, 1993; Vihinen, 1990; Qian, 1994; Wiegand,
1995; Fujimoto, 1998; Wilcox, 1984) xylanase, (Wong, 1988; Biely,
1997; Georis, 1999) and endo-N-acetylglucosamine hydrolysis
activity, (Tarentino, 1990; Tarentino, 1992; Robbins, 1984;
Trimble, 1991) can be targeted specifically.

Dex-Mtx CIDs with different oligosaccharide linkers are prepared 
using the same strategy as for the chitinase substrate (above).
The sulfoxide glycosyl donor method for carbohydrate synthesis
allows a variety of sugar monomers to be introduced. (Kahne,
1989) Moreover, both the regio- and stereochemistry can be
controlled. (Yan, 1994; Liang, 1996) As with the chitinase
control, the 5-FOA growth selection is used to identify enzymes
that cleave the various glycosidic linkages. Each glycoside
substrate is tested individually. Mixtures of substrates cannot
be tested because the uncleaved substrates would continue to
activate ura3 transcription. If the screen does not pick up any
enzymes, known glycosidases from other organisms may be used as
controls both for the growth selections and to test the Dex-Mtx
substrates in vitro.
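
The reason each substrate must be tested individually can be illustrated with a small logical model. The sketch below is only a simplified illustration, not part of the described method: the class name, function names, and linkage labels are hypothetical, and the model assumes that any intact Dex-Mtx substrate is sufficient to keep ura3 transcription on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Hypothetical model of a yeast cell expressing one candidate enzyme."""
    cleavable_linkages: frozenset  # linkage labels the enzyme can hydrolyze


def ura3_is_on(cell, substrates):
    """Assume ura3 is transcribed if any supplied CID substrate stays intact."""
    return any(linkage not in cell.cleavable_linkages for linkage in substrates)


def survives_5foa(cell, substrates):
    """Cells survive on 5-FOA only when ura3 transcription is off."""
    return not ura3_is_on(cell, substrates)


# Illustrative linkage labels (assumptions, not taken from the text).
chitinase_cell = Cell(frozenset({"GlcNAc-b1,4"}))
single = ["GlcNAc-b1,4"]
mixture = ["GlcNAc-b1,4", "Glc-a1,4"]

print(survives_5foa(chitinase_cell, single))   # substrate cleaved, cell survives
print(survives_5foa(chitinase_cell, mixture))  # uncleaved substrate keeps ura3 on
```

In this model a cell expressing an active enzyme is still killed when a second, uncleaved substrate is present, which is why a mixture would mask true positives.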

The foregoing permits the characterization of in vitro activity
and biological function of glycosidases identified using the CID
screen. Similarly, cDNA libraries from other organisms can be
screened. The Dex-Mtx substrates can be used to evolve
glycosidases with unique specificities. In addition, the cDNA
screen can be extended to other classes of enzymes, such as
proteases.

Example 4

Evolution of a Diels-Alderase 

The Diels-Alder reaction is one of the key carbon-carbon bond
forming reactions in synthetic organic chemistry (Fig. 7B).
(Oppolzer 1991; Carruthers 1990) Surprisingly, no natural
Diels-Alderases are known, although catalytic antibodies have been
generated for this reaction. (Braisted 1990; Hilvert 1985, 1994,
1989; Suckling 1992; Gouverneur 1993) Using the selection
strategy outlined below, the B. subtilis chorismate mutase is
evolved into a "Diels-Alderase" that can catalyze the
cycloaddition of 1-carbamyl-1,3-butadiene and 2-propanoic acid
(Fig. 7B).

Chorismate mutase (CM) catalyzes the Claisen rearrangement of
chorismate to prephenate (Fig. 7A). Like the Diels-Alder
reaction, the Claisen rearrangement is a pericyclic reaction
with a six-membered transition state (ts). This similarity,
together with inspection of the active site, suggests that the
chorismate mutase active site can accommodate a Diels-Alder
transition state. The structures of the B. subtilis and E. coli
enzymes and of an antibody that catalyzes this reaction, each in
complex with a ts analog, have been determined to high resolution.
(Chook 1994; Lee 1995; Haynes 1994) Although it is homotrimeric,
we use the chorismate mutase from B. subtilis because it has an
open active site and is monofunctional, nonallosteric, and easy to
manipulate. (Gray 1991; Gray 1990)

To create proteins with catalytic efficiencies that rival
natural enzymes, we mimic the evolution of natural enzymes.
First, we evolve a rudimentary catalyst from an existing protein
scaffold by demanding that catalysis provide the cell with a
selective advantage. Then we improve the catalytic efficiency of
the rudimentary catalyst by subjecting it to further
randomization and recombination and increasing the stringency
of the selection.

We begin with the simple substrates shown in Figure 8. The CM
structural data allow us to design the diene and dienophile (1) to
utilize the electrostatic environment in the CM active site and
(2) to incorporate H1 and H2 without disrupting substrate
binding. It may be necessary, however, to incorporate
additional functionality to improve substrate binding or to
modulate the electrophilicity of the dienophile to prevent
reaction with cellular components.

To select for a Diels-Alderase, we first show that the
cycloaddition product, but neither substrate, can mediate
protein dimerization. A retrosynthesis of the diene and
dienophile is shown in Figure 8. Then we evolve a
Diels-Alderase from libraries of CM variants using a LEU2 selection
in which the media is supplemented with the substrates H1-diene
and H2-dienophile. Only cells containing active enzyme catalyze
the covalent coupling of the substrates, induce protein
dimerization and LEU2 transcription, and hence survive on media
without leucine. In practice, we (1) isolate a few rudimentary
Diels-Alderases by supplementing the media with low
concentrations of leucine; (2) confirm the activity of these
enzymes in vitro using purified protein; (3) improve the
catalytic efficiency of these initial catalysts by further
randomization and recombination under more stringent LEU2
selection conditions; and finally (4) characterize both the
rudimentary and improved catalysts in vitro using purified
protein. The evolution of a Diels-Alderase establishes that the
selection strategy can be used to create synthetically useful
protein catalysts for chemically demanding transformations.
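
The staged selection described above (a permissive first round followed by rounds of diversification under increasing stringency) can be sketched as a toy simulation. This is an illustrative abstraction only: it assumes each variant can be summarized by a single numeric activity, that LEU2 selection acts as a simple activity threshold, and that randomization and recombination can be mimicked by Gaussian perturbation. The function names and parameter values are hypothetical.

```python
import random


def leu2_survivors(library, threshold):
    """Keep variants whose (assumed scalar) activity meets the threshold."""
    return [activity for activity in library if activity >= threshold]


def diversify(parents, offspring_per_parent, rng):
    """Stand-in for randomization and recombination: perturb each parent."""
    return [
        max(0.0, parent + rng.gauss(0.0, 0.1))
        for parent in parents
        for _ in range(offspring_per_parent)
    ]


def evolve(initial_library, thresholds, offspring_per_parent=20, seed=0):
    """Apply selection rounds of increasing stringency, diversifying survivors."""
    rng = random.Random(seed)
    library = list(initial_library)
    if not thresholds:
        return library
    for threshold in thresholds:
        survivors = leu2_survivors(library, threshold)
        if not survivors:
            return []  # nothing passed this round of selection
        library = survivors + diversify(survivors, offspring_per_parent, rng)
    # Final pass so that only variants meeting the strictest threshold remain.
    return leu2_survivors(library, thresholds[-1])


# A low first threshold mimics leucine-supplemented (permissive) media.
improved = evolve([0.0, 0.05, 0.2], thresholds=[0.1, 0.3, 0.5])
print(len(improved))
```

Lowering the first threshold plays the role of supplementing the media with low concentrations of leucine: weak catalysts that would fail a stringent selection are retained as starting points for later rounds.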